
Why C and C++ are Bad

Why C and C++ are Awful Programming Languages

Imagine you are a construction worker, and your boss tells you to connect the gas pipe in the basement to the street's gas main. You go downstairs, and find that there's a glitch; this house doesn't have a basement. Perhaps you decide to do nothing, or perhaps you decide to whimsically interpret your instruction by attaching the gas main to some other nearby fixture, perhaps the neighbor's air intake. Either way, suppose you report back to your boss that you're done.

KWABOOM! When the dust settles from the explosion, you'd be guilty of criminal negligence.

Yet this is exactly what happens in many computer languages. In C/C++, the programmer (boss) can write "house"[-1] * 37. It's not clear what was intended, but clearly some mistake has been made. It would certainly be possible for the language (the worker) to report it, but what does C/C++ do?

It finds some non-intuitive interpretation of "house"[-1] (one which may vary each time the program runs, and which can't be predicted by the programmer),

then it grabs a series of bits from some place dictated by the wacky interpretation,

it blithely assumes that these bits are meant to be a number (not even a character),

it multiplies that practically-random number by 37, and

then reports the result, all without any hint of a problem.

[Based on an example by M. Felleisen]

(See also: a blog post by one Alex Gaynor, Modern C++ Won't Save Us.)
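For contrast, here is a minimal sketch of what "reporting the problem" could look like in C++ itself. The helper names char_times_37 and access_is_rejected are hypothetical, invented for this illustration; the only library facility relied on is std::string::at, which throws std::out_of_range for an index past the end instead of silently reading whatever bits are nearby.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original text): returns the character
// code at `index` multiplied by 37, but refuses an index outside the string.
int char_times_37(const std::string& text, long index) {
    if (index < 0) {
        // A negative index has no sensible meaning here, so report it.
        throw std::out_of_range("negative index");
    }
    // at() checks the upper bound and throws std::out_of_range on failure.
    return text.at(static_cast<std::size_t>(index)) * 37;
}

// Hypothetical helper: true if the access was reported as an error.
bool access_is_rejected(const std::string& text, long index) {
    try {
        char_times_37(text, index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With these definitions, char_times_37("house", 0) yields 'h' * 37, while an index of -1 or 5 is rejected rather than answered with a practically-random number.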

This is only one example of how C and C++ get some of the basics wrong. Even the authors of the definitive C++ Annotated Reference Manual (“ARM”) confess that there are problems with the basics (for example, “the C array concept is weak and beyond repair” [pg 212]). I highly recommend C++??: A Critique of C++ for a detailed exposition of flaws (major and minor) of both C and C++.

For a technical critique of C/C++ from a systems/compiler perspective (about the inherent danger of "undefined behavior" and how it arises surprisingly often even in innocuous C/C++ programs), see this excellent series of blog posts.

A Bad Choice For Students; An Alternative

Understanding what C or C++ programs do requires additional, reasonably detailed knowledge of how the computer's memory system works (e.g. heap vs. stack memory allocation; word alignment). This, by definition, is low-level; high-level languages (e.g. Mathematica, Java, Scheme, Python) let you focus on computing an answer rather than on details of how the language might implement your program. (C was never intended to be a high-level language, but rather a low-level language with some high-level features on top of it. Such a language has its place, but not as a general-purpose language.) I'm not saying low-level programming is bad. But when learning how to program, the important thought process should be on how to take a problem-description to code, and not on how the machine stores bits. Low-level programming is very important for programmers who write device drivers, for compiler writers, etc. But these applications account for a very small portion of all programs written. Let beginning programmers learn the fundamentals, and then those particular students who need it can take a later course in low-level programming.

Unlike some languages, C and C++ are extremely permissive about what is a legal program. This flexibility might be nice for professionals, but for beginners it just means that typos tend to cause mysterious behavior, rather than signalling errors. In my teaching experience, I've often seen students baffled for hours, because they accidentally used a comma somewhere, rather than a semicolon. They often flail for hours, randomly adding or removing keywords they've heard of, like static or public or &, and seeing if that happens to solve their problem. This sort of flailing doesn't help anybody learn, and it's the result of a language which assumes the programmer doesn't ever make mistakes or need help. A language for students should flag advanced or ambiguous constructs as probable typos. For instance, it's not obvious that in i = v[i++], the final value of i is undefined [C++ ARM, p.46]. It's not difficult for a language to warn you if you write this, but no C++ compilers choose to.

Programming is a difficult task, learned over months and years. Object-oriented programming (the “++” part of “C++”) is a more advanced topic which is important for larger programs, but is best taught after the fundamentals have been learned.

In Mathematica, two billion plus two billion is four billion. In Java, it's defined to always be -295 million (approx). In C/C++, the language does not define the result: it's whatever answer gets returned, and will vary from machine to machine. Similarly, an example from the Java Language Specification p. 308: “it is not correct that 4.0*x*0.5 [is the same as] 2.0*x ; while roundoff happens not to be an issue here, there are large values of x for which the first expression produces infinity (because of overflow) but the second expression produces a finite result.” (Again, the Java spec at least defines what the answer should be in all cases, unlike C++ where this is left to vary between platforms.) The point is not whether there are good reasons why some languages choose (unlike Mathematica or Scheme) to use imperfect arithmetic, but rather that when teaching a student how to decompose a problem into functions and how to program effectively, it's purely a digression to have to talk about numeric issues stemming from the language's choice of internal representation of numbers. This approach encourages the view that programming is a low-level activity, contradicting 60 years of working towards higher-level languages.
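The integer case can be made concrete with a short sketch, assuming 32-bit integers. Both function names are hypothetical, introduced only for this illustration: checked_add reports overflow instead of performing an addition whose result C++ leaves undefined, and wrap_to_32_bits computes the two's-complement wraparound that Java specifies, using 64-bit arithmetic so that nothing overflows along the way.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: stores a + b in `sum` and returns true, or returns
// false (leaving `sum` untouched) if the exact result does not fit in 32 bits.
bool checked_add(std::int32_t a, std::int32_t b, std::int32_t& sum) {
    const std::int32_t max = std::numeric_limits<std::int32_t>::max();
    const std::int32_t min = std::numeric_limits<std::int32_t>::min();
    if (b > 0 && a > max - b) return false;  // would exceed the maximum
    if (b < 0 && a < min - b) return false;  // would fall below the minimum
    sum = a + b;
    return true;
}

// Hypothetical helper: reduces an exact 64-bit value into the 32-bit
// two's-complement range [-2^31, 2^31), i.e. Java-style int wraparound.
std::int64_t wrap_to_32_bits(std::int64_t exact) {
    const std::int64_t modulus = std::int64_t{1} << 32;
    std::int64_t reduced = ((exact % modulus) + modulus) % modulus;  // [0, 2^32)
    return reduced >= modulus / 2 ? reduced - modulus : reduced;
}
```

Here checked_add(2000000000, 2000000000, sum) returns false, and wrap_to_32_bits(4000000000) gives -294967296, the "approximately -295 million" that a wrapping 32-bit addition produces.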

C++ is a large language, with many features, and requiring many statements in beginner programs whose meaning is inscrutable to the beginner.¹ (C++ has 68 operators, with 18 levels of precedence²; compare to Scheme, which has no levels of precedence and no needless distinction between functions and operators, instead using parentheses consistently to mean “call a function”. Learning about all these levels has nada to do with the problem which the program is trying to solve.) The C++ standardization committee itself admits [X3J16 92] “C++ is already too large and complicated for our taste.” (Compare to Scheme, which has zero operators: everything is a function, and beginners don't waste time wondering if a certain construct is allowed in a certain context, or get surprised by precedence rules.)

So why is C++ so prevalent?

Given these known flaws with C/C++, why is there the popular misconception, among too many programmers and public alike, that C++ is a good language? I genuinely am at a loss to explain it. But here's my suspicion: When C/C++ programmers, used to walking the tightrope without a net, see that a language like Java or Scheme is doing extra work (verifying that any additions really are given numbers instead of strings, making sure array indices are legal, etc.), their reaction is “ugh, the computer is doing so much extra work, my programs will run too slow!” Indeed, benchmark programs do run faster in C or C++.

But there are a number of things to keep in mind: It is well-documented that development time is much longer in C/C++, since bugs creep in more easily. Hence, cost is also higher for C/C++ programs. (Many C/C++ projects have never been completed because of obscure memory bugs.) I'd rather have a slower, correct program than one which finds a wrong answer more quickly :-) .

Or even, how important is it to have fast programs? I don't know about you, but when I think about it, most of my wait-time behind the computer is due to my slow typing, or thinking, or waiting for info to download. I've spent much less time waiting for a calculation to finish than I have waiting for my computer to re-boot, or re-typing data which was lost because of a crash. (At the current moment, my netscape is unusable, complaining about pointer-based errors “invalid Pixmap” and “invalid GC parameter”. I'll have to try re-installing. Grrr.)

This is not to say that no applications require high performance: voice recognition, drafting, visualization of real-time CAT scan data, and modeling star evolution or wind tunnels all do. Yes, C/C++ can sometimes give that performance better than other languages. And expert programmers using C/C++³ for those situations is fine. Indeed, taking prototype code and compiling or re-implementing it for efficiency is one of the prime goals of computer scientists. But such programming (and intensive debugging) is not the best place for the effort of an astronomer or medical researcher. Myself, I rarely or never run those types of programs; most of my time waiting on the computer is waiting for a page to download, not a slow program.

After talking repeatedly with people who tout C++'s run-time efficiency while dismissing its lack of safety, I've seen that they often have a couple of other attitudes:

First, that bugs and crashes are an acceptable or inevitable part of computers. This is an outright lie, and it is foisted off onto the public, who feel forced (for compatibility reasons) to buy from only a few major software firms. The public becomes resigned to poorly-written products and crashes, vindicating the initial attitude.

Second, these people exhibit a form of programmer machismo: “Other people might need the computer to make safety checks as their program runs, but not me! I'm smarter and better than all those thousands of other (more experienced) programmers who've shipped bugs in their products.”

Writing large software systems bug-free is still a task the industry is learning. But having casual programmers learn C or C++, instead of a high-level language, is not the answer!

Another example

For a much more detailed argument on the shortcomings of C and C++, see Ian Joyner's C++??: A Critique of C++, which includes examples of both flaws inherited from C and flaws introduced in C++. For example, he correctly points out that constructs like:

// s1, s2 are char*
// (intended as strings, not ptrs to an individual char).
while (*s1++ = *s2++);

might look optimal to C programmers, but are the antithesis of efficiency. Such constructs preclude compiler optimisation for processors with specific string handling instructions. A simple assignment is better for strings, as it will allow the compiler to generate optimal code for different target platforms. If the target processor does not have string instructions, then the compiler should be responsible for generating the above loop code, rather than requiring the programmer to write such low level constructs. The above loop construct for string copying is contrary to safety, as there is no check that the destination does not overflow, again an undetected inconsistency which could lead to obscure failures. The above code also makes explicit the underlying C implementation of strings, that are null terminated. Such examples show why C cannot be regarded as a high level language, but rather as a high level assembler.
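The two remedies mentioned in that passage, a simple assignment and a check that the destination does not overflow, might be sketched as follows. This is an illustration under assumptions, not code from Joyner's critique: copy_by_assignment and bounded_copy are hypothetical names, and std::string stands in for "a string type the language knows how to assign".

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: the "simple assignment" form. How the characters are
// moved is left to the library and compiler, and the destination grows as
// needed, so it cannot overflow.
std::string copy_by_assignment(const std::string& source) {
    std::string destination;
    destination = source;
    return destination;
}

// Hypothetical helper: the same loop as above, but with the missing check.
// Copies at most capacity - 1 characters, always null-terminates, and
// returns true only if the whole source string fit.
bool bounded_copy(char* dest, std::size_t capacity, const char* src) {
    if (capacity == 0) return false;  // no room even for the terminator
    std::size_t i = 0;
    while (i + 1 < capacity && src[i] != '\0') {
        dest[i] = src[i];
        ++i;
    }
    dest[i] = '\0';
    return src[i] == '\0';
}
```

Copying "house" into a 4-byte buffer with bounded_copy leaves "hou" in the buffer and returns false, so the truncation is detected rather than written past the end of the destination.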

You can certainly find supporters of C++. But they tend to either misunderstand issues, or have a more relaxed attitude towards unnecessary bugs in commercial products. For instance, choosing a random page from the above link, we find the assertion

ANSI C makes type safety optional. C++ makes it mandatory. In C++ it is very difficult (not impossible) to violate the type system.

class Party { /* ...class details... */ };
class Trouble { /* ...class details... */ };
Party *invitation = (Party*) (new Trouble());

(a full example)
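The cast above can be turned into a small compilable sketch. The class bodies are left empty here, and forge_invitation is a hypothetical name introduced for illustration. The static_assert records that the type system refuses to convert a Trouble* to a Party* implicitly; the C-style cast then overrides that refusal, and the language does not require any diagnostic.

```cpp
#include <type_traits>

class Party {};
class Trouble {};

// Without a cast, the conversion is rejected at compile time.
static_assert(!std::is_convertible<Trouble*, Party*>::value,
              "Trouble* must not convert implicitly to Party*");

// Hypothetical helper: the C-style cast compiles anyway, yielding a Party*
// that still points at a Trouble object. Using the result as a Party would
// be undefined behavior; this sketch only returns the pointer.
Party* forge_invitation(Trouble* trouble) {
    return (Party*)trouble;
}
```

The returned pointer holds the same address as the original Trouble object, which is exactly the problem: nothing at compile time or run time marks it as mislabeled.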


...type errors in C are often the causes of strange bugs that take weeks or months to find, and that exhibit transient and misleading behavior. They often foul the stack or heap and cause eventual failure several million instructions after the precipitating event. Such bugs are the hallmark of poor quality software.

The above citation also asserts:

Why did C succeed? Because it cut through all the crap. Instead of fighting for “purity”, the authors of C created a language that could be used. They never contended that it was perfect or pure,


(2001.jan)

Mozilla guidelines to assure your C++ is portable. Note that one of Mozilla's software architects says:

[Abelson and Sussman] is absolutely the best book on the topic I've ever seen. By the time you make it halfway through this book, you will have a very firm grasp on what object oriented programming is, because that's what this book is about — programming. This book uses Scheme as its instructional language, but please don't let that put you off. Because this book teaches you programming, not a particular language, and that's the point that so many mediocre programmers manage to get through school without understanding — that 'object oriented' is a style of programming, not a property of some particular language. This book is about problem solving techniques, not about figuring out where the semicolons go.

¹ Java, while much better than C++, shares this same weakness: the smallest Java program requires about 12 keywords, each replete with meaning; a beginner must be told “put these words in your program in just this right order, else it won't work”. I've seen many students needlessly frustrated because it takes 30min to figure out their non-working program resulted from only inscribing eleven of the dozen necessary arcane glyphs. They may understand conceptually exactly what they want to do, but the arbitrary details of excessive syntax take out all the interest. (Some studies suggest that the prevalent teaching mode, encouraging arbitrary tinkering with little direction or meaning just trying to get it to work, is one reason for the pronounced gender bias seen in the field of computer science.)

Any teacher knows not to distract from a topic by introducing advanced details to a beginner. Common sense? You wouldn't know it from all the people who want to teach intro programming, but then use Java to do so.

² For comparison, Java has 46 operators with 15 levels of precedence.

³ Indeed, any professional programmer who uses C++ will tell you that they use a disciplined subset of it, since without self-imposed guidelines it's easy to hang yourself with some of the language's “features”. You have to wonder when style-guides for major, experienced projects include many rules of the form “don't use feature X of the language”; it indicates that the community has learned which language features are more harmful than helpful.