Wednesday, June 11, 2008

C++ VS. C#

 

There are a number of areas in which to compare and contrast the advantages of C++ and C#.

There are also questions of managed C++ (C++/CLI) compared to unmanaged ("classic") C++; for now we're comparing unmanaged C++ to C#, which uses a managed memory model.

Code Writing

Memory

C#'s strengths:

·         Automatic memory initialization and management. While coding, little attention needs to be paid to who owns what memory, so coding is generally faster. The whole problem of freeing memory when handling an exception does not exist.

·         Memory freeing happens automatically in C# through garbage collection. Overloading pointers to perform reference counting can help ameliorate memory freeing problems in C++, but these schemes are often tricky to get exactly right.
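As a rough illustration of the reference-counting approach mentioned above, here is a minimal sketch using std::shared_ptr. It assumes a C++11 or later compiler (older code would have used a Boost or hand-rolled equivalent). The Texture type and the function names are hypothetical, invented only for this example.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical resource type; the counter tracks live instances for illustration.
struct Texture {
    static int liveCount;
    Texture() { ++liveCount; }
    ~Texture() { --liveCount; }
};
int Texture::liveCount = 0;

// Allocates a Texture, then throws. The shared_ptr destructor runs during
// stack unwinding, so the Texture is freed without explicit cleanup code.
void loadAndFail() {
    std::shared_ptr<Texture> tex = std::make_shared<Texture>();
    throw std::runtime_error("simulated failure");
}

// Returns the number of Textures still alive after the exception is handled.
int liveTexturesAfterFailure() {
    try {
        loadAndFail();
    } catch (const std::runtime_error&) {
        // The Texture has already been destroyed by the time we get here.
    }
    return Texture::liveCount;
}

// Shows the reference count while two owners share one object.
long sharedOwnerCount() {
    std::shared_ptr<Texture> a = std::make_shared<Texture>();
    std::shared_ptr<Texture> b = a;  // second owner, count becomes 2
    return a.use_count();
}
```

Calling liveTexturesAfterFailure() returns 0, and sharedOwnerCount() returns 2. Cycles of shared_ptr objects are still never freed, which is one of the ways these schemes go wrong.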

C++'s strengths:

·         You are in control of memory, though this feature can be a double-edged sword. You can tell exactly how much memory is needed in advance, and precisely control how that memory is laid out, if need be. There will never be any delays due to garbage collection.
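To make "control of memory" a little more concrete, here is a small sketch. The Vertex struct and makeVertexBuffer are hypothetical names, and the size checks assume a typical platform with 4-byte floats and no unusual padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex record with a fixed, known layout.
struct Vertex {
    float x, y, z;        // 12 bytes of position
    std::uint32_t color;  // 4 bytes of packed RGBA
};

// Layout is checked at compile time; the build fails if it ever changes.
static_assert(sizeof(Vertex) == 16, "Vertex must be exactly 16 bytes");
static_assert(offsetof(Vertex, color) == 12, "color must follow the position");

// Reserves all storage up front, so filling the buffer never reallocates.
std::vector<Vertex> makeVertexBuffer(std::size_t count) {
    std::vector<Vertex> buffer;
    buffer.reserve(count);                // one allocation, sized in advance
    const Vertex* start = buffer.data();
    for (std::size_t i = 0; i < count; ++i) {
        buffer.push_back(Vertex{float(i), 0.0f, 0.0f, 0xFFFFFFFFu});
    }
    assert(buffer.data() == start);       // storage did not move while filling
    return buffer;
}
```

The static_assert lines document the layout the rest of the code relies on, and the single reserve call is the "how much memory is needed in advance" part.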

Both languages can have the problem of holding onto memory after it should have been freed. In C++ this is common, as a destructor might not explicitly free memory it should have. However, C# can also hold onto memory; for example, an event might need to be unregistered before the memory it references can be let go.
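The event case has a close analogue in C++ when callbacks capture reference-counted pointers. The sketch below is only an illustration of the idea, not C# semantics; Button, Panel and the function names are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical event source that stores subscriber callbacks.
struct Button {
    std::vector<std::function<void()>> clickHandlers;
};

// Hypothetical subscriber holding a large buffer.
struct Panel {
    std::vector<char> pixels = std::vector<char>(1024 * 1024);
    int clicks = 0;
};

// Registers a handler that captures the panel by shared_ptr. The Button now
// co-owns the Panel, so the Panel stays alive as long as the handler does.
void subscribe(Button& button, const std::shared_ptr<Panel>& panel) {
    button.clickHandlers.push_back([panel] { ++panel->clicks; });
}

// Returns true if the panel is still alive after the caller drops its reference.
bool panelSurvives(bool unregisterFirst) {
    Button button;
    std::shared_ptr<Panel> panel = std::make_shared<Panel>();
    std::weak_ptr<Panel> watcher = panel;
    subscribe(button, panel);
    if (unregisterFirst) {
        button.clickHandlers.clear();  // analogous to unregistering the event
    }
    panel.reset();                     // the caller lets go of the panel
    return !watcher.expired();
}
```

Here panelSurvives(false) returns true (the handler keeps the megabyte alive), while panelSurvives(true) returns false.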

Syntax

C#'s strengths:

·         Single file: there is no .cpp/.h code vs. header file separation. This makes writing code simpler, and encourages a good programming style: it is so simple to add a method to a class that it is worth just doing so. In C++ adding a new method involves editing two files, adding a similar but not quite the same method descriptor to each, and compiling to check your work.

·         Extremely fast compilation. Large systems take minutes to compile, not an hour, leading to less programmer down time. For example, Autodesk Design Review, in C++, can take a solid hour to build. Inventor can take hours to build. C# is fast enough that the whole build can be performed automatically whenever a check-in is done, quickly showing any build problems.

·         One set of common libraries. C# has a large number of classes supporting common operations, handling data and exceptions in a consistent way. C++ has the STL, which is not as comprehensive or universal.

C++'s strengths:

·         Every programmer (pretty much) is familiar with the language and has used it for years. That said, short books such as O'Reilly's C# Essentials can ease the transition; even just reading a "differences article" like this one can get programmers most of the way there. C# is about 85% like C++, with a few new constructs and a few kinks ironed out.

·         You are "closer to the metal" with C++, in that you can intuit a bit better what operations are quick and what are not.

·         Integration of assembly code into inner loops of C++ code can make it faster. C# runs on the CLR, so inline assembly is unavailable.

Language Strengths

C#'s strengths:

·         Confusing, difficult-to-debug templates and #defines do not exist. Some design patterns are built into the language, such as C#'s delegates.

·         Reflection allows code to parse other data structures without much intervention from the user. This can allow, for example, simple dialogs to be automatically constructed and interpreted that allow someone debugging the code to quickly change parameter values.

·         #include files and the headaches of getting these to compile are almost non-existent in C#; its "using" mechanism is geared for large projects.

·         There are many other little improvements of C# over C++: property set/get is part of the language, there is direct support for events, enums can represent bits in a bit string and be accessed correctly, "int" has a platform-independent meaning (important for 64-bit platforms), and so on. A good summary can be found here.
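For comparison, here is roughly what the bit-flag enum and the platform-independent integer look like on the C++ side. This is a sketch assuming C++11 or later; FileAccess and hasFlag are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Fixed-width type: 31 value bits plus sign wherever it exists, like C#'s "int".
static_assert(std::numeric_limits<std::int32_t>::digits == 31,
              "int32_t is a 32-bit signed integer");

// Hypothetical bit-flag enum, similar in spirit to a C# [Flags] enum.
enum class FileAccess : std::uint8_t {
    None  = 0,
    Read  = 1 << 0,
    Write = 1 << 1,
    Exec  = 1 << 2
};

// C++ does not supply bitwise operators for scoped enums, so we define them.
constexpr FileAccess operator|(FileAccess a, FileAccess b) {
    return static_cast<FileAccess>(static_cast<std::uint8_t>(a) |
                                   static_cast<std::uint8_t>(b));
}
constexpr FileAccess operator&(FileAccess a, FileAccess b) {
    return static_cast<FileAccess>(static_cast<std::uint8_t>(a) &
                                   static_cast<std::uint8_t>(b));
}

// Returns true if every bit in "flag" is set in "value".
constexpr bool hasFlag(FileAccess value, FileAccess flag) {
    return (value & flag) == flag;
}
```

The operator boilerplate is the part C# gives you for free; plain "int" in C++ carries no such size guarantee, which is why the fixed-width types exist.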

C++'s strengths:

·         C#'s class reference system can sometimes be confusing to debug. Class references have no "pointer handle" to look at in the debugger, making it difficult to know what data is being examined. C++'s use of pointers means data can be precisely identified. (Note that you can achieve the same result while debugging C# by using the "Make Object ID" feature in the debugger; it's just a little more obscure to use - Steve Anderson.)

·         A number of C++ constructs can be used to perform the same functionality as C# (underneath the hood everything is machine code, so this statement is obviously true). In some cases C++ can be a bit more straightforward; see David Brownell's blog. That said, most programmers still use the default "new/delete" functionality, dumb pointers, etc. Informed C++ practice guidelines set by expert programmers can help avoid these problems.

Maintenance

C#'s strengths:

·         Automatic memory initialization and bounds checking. Some of the most difficult defects to track down (if they are noticed at all before the product is released) are those that occur sporadically, often only on release compiles of C++ code. These defects are almost always due to one of two problems: initialization or reading/writing out-of-bounds memory locations. C# avoids or identifies these problems early on: code will not compile if use of an uninitialized variable is detected, and execution will halt if an array overflow occurs.
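C++ can opt in to similar checks, though nothing forces you to. A minimal sketch, assuming C++11 or later; tryRead and RenderSettings are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Reads one element with bounds checking. vector::at throws
// std::out_of_range on a bad index, whereas operator[] would be
// undefined behavior.
bool tryRead(const std::vector<int>& values, std::size_t index, int& out) {
    try {
        out = values.at(index);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Hypothetical settings struct; default member initializers guarantee that
// no field is ever left holding an indeterminate value.
struct RenderSettings {
    int   samples = 1;
    float gamma   = 2.2f;
    bool  shadows = false;
};
```

The difference from C# is that these are conventions a team has to adopt; the unchecked operator[] and uninitialized members remain the path of least resistance.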

C++'s strengths:

·         Over time a number of tools have been developed that can help the C++ developer, such as tools for finding uninitialized

Performance

As mentioned, C++ can be faster than C# for a number of operations.

There is a startup cost when the CLR compiles C#'s intermediate code into machine code as the program is first executed. There is also a memory overhead for the CLR system itself, though this is a fixed cost and so matters less for larger applications. C++ does not have these costs.

An interesting fact is that C++ code is normally compiled for the least common denominator platform, e.g. a Pentium II processor, rather than for the commonly used CPU. C# uses CLR, which at least in theory can be compiled and optimized on the fly for the platform it is on. I do not know if this optimization is done in practice.

This is an interesting series of posts, starting here: http://www.codinghorror.com/blog/archives/000299.html. The author notes: "This managed code is a line for line conversion in the dumbest possible way of his initial program with no attempt whatsoever to optimize anything." Yet it was more than 10x faster than the original C++ code. One iteration of optimization made the C# program considerably faster, though eventually the C++ program, after 6 iterations of optimization, was finally a bit faster than the C# code. In the process, the C++ coder had to write his own string class, file I/O code, memory allocator, international code support, etc. More around here: http://blogs.msdn.com/ricom/archive/2005/05/10/performance-quiz-6-chinese-english-dictionary-reader.aspx (find the rest here: http://blogs.msdn.com/ricom/archive/2005/05.aspx), on the MSDN site.

Another relevant article is at http://www.xtremedotnettalk.com/showthread.php?t=83128 - tidbit that Tom Miller, lead dev. for managed DirectX, says Managed DirectX is 3-5% slower than unmanaged DirectX. Tom writes about how the initial version of managed DirectX had some problems, now fixed: http://msdn.microsoft.com/msdnmag/issues/05/08/EndBracket/. Managed DirectX is not a toy; shipping products include those by http://www.koiosworks.com/.

One fairly comprehensive article with an older version of .NET (which is constantly improving):
http://www.ddj.com/184401976 - June 2005, in-depth comparison of common operations. C# is similar to C++ in performance, except for matrix multiplication (see http://www.ddj.com/dept/cpp/184401976?pgno=5 - why? Note this is not 4x4 matrix multiply, so doesn’t really carry over to us directly). The two summary graphs:

http://www.ddj.com/showArticle.jhtml?documentID=cuj0507bruckschlegel&pgno=10 – without math

http://www.ddj.com/showArticle.jhtml?documentID=cuj0507bruckschlegel&pgno=11 – with math, more like what we do.

Bottom line for "with math": C++, 761.5; C# 2.0 beta 2, 890.25 - C# is about 17% slower overall (and surprisingly faster in some cases, e.g. lists, object creation & destruction).
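The overall percentage follows directly from the two totals. A tiny sketch of the arithmetic, assuming (as the comparison implies) that a larger total means more time taken; percentSlower is a hypothetical helper name.

```cpp
#include <cassert>

// Percentage by which "slower" exceeds "faster", where both are totals
// for which a larger number means more time taken.
double percentSlower(double slower, double faster) {
    return (slower / faster - 1.0) * 100.0;
}
```

With the totals above, percentSlower(890.25, 761.5) works out to roughly 16.9.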

Information from Ian Ameline (Maya):

I've done some performance comparisons, and when it comes to single precision floating point math (very common in 3D graphics and image/signal processing) C# and .net are substantially slower than C++. The Perlin noise generator used in Maya is very performance critical – for many renders, over 60% of the time is spent in this code.

Compiler   | Version          | Cycles/result | Perf mult
Intel C++  | V9.1.035         | 552           | 1.00
MS C++     | V14.00.50727.762 | 1078          | 1.95
MS C++.net | V2.00.50727.42   | 1866          | 3.38 (or 1.73 compared to native MS)
MS C#.net  | (not given)      | 1195          | 2.16 (or 1.11 compared to native MS)

These were measured on a 3 GHz P4 Prescott with an 800 MHz FSB, Windows XP SP2, freshly booted, nothing else running. All tests were run 10 times, and the best time for each test was used. (Variance from run to run was less than 2%.)

As a reference, on this platform, my hand coded SSE assembler version was 684 cycles/result – slower than the code generated by the Intel compiler (but slightly faster than the Intel V8 compiler.)

These numbers are consistent with what many are reporting – that .net is not that much slower than native code – the problem is that the native code they're comparing to is the garbage produced by the native MS C++ compiler.

Good native code is over 2x faster than .net on a number of problem domains of significant interest to us, and not many people are making that comparison.

This sort of code is typical for many of the performance critical paths of Maya.

(We compile all of our performance critical code with the Intel compiler. It does a substantially better job of generating fast code when compared to MS's compiler. I know what I'm talking about here; I spent 8 years working at IBM's research labs on the compiler teams as the architect for their x86 optimizing back end.)

So I went to look at why .net was so slow. The answer is that it does a very simple and naive translation to x87 floating point instructions. Virtually no optimizations to speak of.

The Intel compiler does a fairly smart job of using the SSE instructions for floating point, but does not vectorize this code – it's still scalar. When the Intel compiler does autovectorize, it gets another 2 to 3x performance out of your code. For our billow (clouds, etc.) texture generator, it's 3.5 times faster than MSVC 8, and about 6 times faster than C#.

The MS compiler is just too stupid to know that it doesn't need to promote all the single precision math to double precision. That's why it compares so badly to the Intel compiler.
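Promotion to double can also creep in at the source level, independent of what the compiler's code generator does. The sketch below shows only the language-level types (assuming C++11 or later); whether the hardware actually computes in wider registers still depends on the compiler and target, which is the problem described above. The function names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Mixing a float with a double literal promotes the whole expression to double.
static_assert(std::is_same<decltype(1.0f * 0.5), double>::value,
              "float * double literal has type double");

// Using a float literal (note the 'f' suffix) keeps the expression in float.
static_assert(std::is_same<decltype(1.0f * 0.5f), float>::value,
              "float * float literal has type float");

// Linear interpolation kept in single precision throughout:
// every operand is a float.
float lerpSingle(float a, float b, float t) {
    return a + (b - a) * t;
}

// Same idea, but the double literal forces double-precision arithmetic
// before the result is narrowed back to float.
float halfwayPromoted(float a, float b) {
    return static_cast<float>(a + (b - a) * 0.5);
}
```

Both functions give the same answer for simple inputs; the difference is in how much work the generated code has to do to get there.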

Interfacing

It's clear that interfacing one C++ program with another C++ program is fairly straightforward, and a well-known quantity (though link errors can still be mysterious at times). C# to C++ and C++ to C# are less common ways of interfacing, and each has its issues and limitations.
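One common route from C# into C++ is to export plain C functions and call them through P/Invoke. A minimal sketch of the C++ side follows; the function name add_int32 and the library name native_math are hypothetical, and the C# declaration in the comment is only an outline of what a caller might write.

```cpp
#include <cassert>
#include <cstdint>

// extern "C" turns off C++ name mangling, so the symbol is exported under the
// plain name "add_int32" and can be located by name from another language.
// (On Windows the function would also need __declspec(dllexport) or a .def
// file to be visible outside the DLL.)
extern "C" std::int32_t add_int32(std::int32_t a, std::int32_t b) {
    return a + b;
}

// A C# caller could then declare it roughly like this:
//
//   [DllImport("native_math", CallingConvention = CallingConvention.Cdecl)]
//   static extern int add_int32(int a, int b);
```

This works well for flat functions over simple types; classes, templates, exceptions and ownership of memory are where the issues and limitations start.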

Tools and Library Support

C#'s strengths:

·         Autocompletion of code just happens, by default, in Visual Studio. This same feature can be made to work for C++, but it's sometimes hit or miss. The ADR C++ developers complain about the incredible amount of time the latest Visual Studio takes to generate Intellisense databases for their C++ files, to the point that they will turn the feature off.

C++'s strengths:

·         Almost every outside library written will interface with C++, since the language is ubiquitous. A major problem for C# is the lack of commitment by Microsoft to fully support managed DirectX 10, and managed wrappers for 64-bit machines.

Compile Time

C# code, or generally managed code, compiles in a small fraction of the time that it takes to compile comparable C++ code. While specific examples are not provided here, the difference is dramatic and leads to a substantial improvement in productivity, particularly for large projects. This speed is due in part to managed code being compiled to an intermediate language (http://en.wikipedia.org/wiki/Common_Intermediate_Language), not machine code. Compiling to machine code is done at application run-time, through just-in-time compilation (JIT; http://en.wikipedia.org/wiki/Just-in-time_compilation).

Platforms

C# is tied to Microsoft and .NET, vs. C++ being platform agnostic. There are efforts to make C# work on Linux and Macs, e.g. the Mono Project, an open-source project supported by Novell.

 
