Prerequisites: This post is aimed at software development specialists, not at users of GreaterThanZero. It assumes familiarity with C++ and at least one other object-oriented programming language.

Summary: It is my impression that a “culture of complexity” has taken root in parts of the C++ community. In this blog post, I examine the history of C++ in an attempt to understand this phenomenon.

I am a longtime user of C++, and I have been an author and speaker in the C++ community for many years. Yet, the number-crunching backend of GreaterThanZero is written in Java and not in C++. Does that mean I have joined the ranks of those who have turned their backs on C++? No, it does not, and I can prove it: search for “C++ auto and decltype”, and you’ll find that the top result is an article that I wrote only a few weeks ago. The decision to implement the math backend of GreaterThanZero in Java was driven by other considerations, primarily the appeal of Google App Engine as a hosting platform. [1]

While I was writing the JavaScript and Java code for GreaterThanZero, I often found myself thinking, “Wow, this is really easy. Anybody could do this.” And then, I am ashamed to admit, I was disappointed. I was disappointed that there was no opportunity for me to prove that I was smarter than the next guy. I actually had to comfort myself with the thought that the mathematics that I was implementing was non-trivial.

Trying to impress people with the complexity of your code rather than with your software is a sign of immaturity and bad engineering. And I’ve been doing it. Needless to say, this is something that I need to work on. I am not going to blame my own weaknesses and character flaws on others. However, I think it is fair to say that the bad habit of wanting to impress others with the complexity of one’s code can develop and flourish only in a “culture of complexity,” a culture in which complex and hard-to-understand code is at least tolerated, if not admired and encouraged. While I don’t know how widespread it really is, I know from my own experience that such a culture of complexity has afflicted part of the C++ community.

I believe that if you want to fix a problem, you have to first understand its origins. Therefore, I will attempt to throw some light on the penchant for complexity that one encounters in parts of the C++ community by going back to the roots of C++ and object-oriented programming.

There is a widespread consensus that the first object-oriented language was Simula 67. Earlier developments that contributed to the emergence of the OO paradigm took place in the Lisp-dominated environment at MIT, in particular, in the artificial intelligence group. [2] I don’t know to what extent the creators of Simula 67 were influenced by what was happening at MIT. However, intentionally or not, Simula 67 did exhibit some continuity with Lisp by embracing reference semantics and garbage collection (more about that later).

How does C++ fit into this picture? As of this writing, the Wikipedia article on Simula states that “The creator of C++, Bjarne Stroustrup, has acknowledged that Simula 67 was the greatest influence on him to develop C++, to bring the kind of productivity enhancements offered by Simula to the raw computational speed offered by lower level languages like BCPL.” [3] [4]

It is probably fair to call the emergence of Simula 67 and the OO paradigm an example of an innovative, if not revolutionary concept that grew organically out of existing, tried and true practices. I submit the fact that Lisp used reference semantics and garbage collection as evidence. In my opinion, the way that C++ combines the OO paradigm with the efficiency and low-level control that C affords is a different kind of endeavor. Trying to combine things that evolved independently of each other in the hopes of getting the best of both worlds is, in my opinion, an experiment. I am not implying any criticism when I say that; on the contrary, I believe that human curiosity and the everlasting search for optimal solutions dictate that such an experiment be conducted.

The experiment that is C++ has not failed. Way too many people have written way too much amazing and useful software in C++ for anybody to denounce C++ as a failure. [5] On the other hand, it is my opinion that the experiment is not a complete success either, in the sense that one could say, this works like a charm, it’s as if OO and C’s low-level control were made for each other. It appears to me that there is a noticeable amount of incompatibility and incongruity there, which manifests itself in two ways:

1. The C++ programmer has to spend a considerable amount of effort to no other end than to make the language work.

2. The evolution of C++ is to a noticeable extent driven by the need to deal with the consequences of the incompatibilities and incongruities at the core of the language.

To see some evidence of this, consider any mainstream OO programming language other than C++ and Objective-C. Think Java or C#. Consider this line:

x = y;

What does that do? It makes the variable x refer to the object that y is referring to, and it lets the garbage collector deal with the repercussions for the object that x was previously referring to. That’s called reference semantics, and it also kicks in during function argument passing. [6]

Now consider that same line in C++, and suppose that x and y are objects of user-defined type. What does the line do? Well, we don’t know, of course, without studying the source code of the user-defined type, as the copy assignment operator may have been overloaded. But there is a default behavior, and that is, roughly speaking, that the object referred to by x is put in the same state as the object referred to by y. Technically, this is achieved by applying the copy assignment operator recursively to the members of the object until basic types like int are encountered, in which case a good old value assignment takes place [7] [8]. That’s called value semantics, although I would prefer the term state semantics: objects do not have a value, they have state, and that’s what’s being transferred here.
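The default behavior described above can be illustrated with a small sketch. The types `Address` and `Person` and the function `memberwise_assignment_demo` are hypothetical names invented for this illustration; neither type declares a copy assignment operator, so the compiler-generated one is used.

```cpp
#include <cassert>
#include <string>

// Hypothetical types, for illustration only.
struct Address {
    std::string city;
    int zip;
};

struct Person {
    std::string name;
    Address address;  // assigned via Address's own implicit copy assignment
};

void memberwise_assignment_demo() {
    Person y{"Ada", {"London", 12345}};
    Person x{"Bob", {"Paris", 75001}};

    // Person declares no copy assignment operator, so the compiler generates
    // one that assigns each member in turn: name, then address (city, zip).
    x = y;

    // x now has the same state as y, but the two remain distinct objects.
    assert(x.name == "Ada");
    assert(x.address.city == "London");
    assert(x.address.zip == 12345);

    // Changing y afterwards has no effect on x.
    y.address.city = "Berlin";
    assert(x.address.city == "London");
}
```

In a language with reference semantics, the last assertion would fail, because `x` and `y` would be two names for the same object.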

Few people would disagree that assignment between variables is part of the design of a language. In C++, assignment between variables of user-defined type involves making a copy of an object. But experience shows that making a copy of an object is hard. The following excerpt from the comp.lang.lisp FAQ page sums it up nicely: “Q: Why isn’t there a DEEP-COPY function in the language? A: Copying an arbitrary structure or object [sic!] needs context to determine what the correct copy is.” In C++, this is exacerbated by the fact that making a copy of an object encounters technical problems such as dealing with C-style pointers as class members. [9] I believe that this is evidence to support my conjecture 1 above: the C++ programmer has to spend effort to ensure that a language feature as basic as assignment between variables works properly. This is just one example; I am prepared to give more.
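To make the pointer problem concrete, here is a minimal sketch of a class that owns memory through a C-style pointer. `Buffer` and `deep_copy_demo` are hypothetical names for this illustration. With the default memberwise assignment, only the pointer would be copied, leaving two objects that both delete the same array; the programmer therefore has to write the copy constructor, the copy assignment operator, and the destructor by hand.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical class that owns an array through a C-style pointer.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Copy constructor: allocate fresh storage and copy the elements.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // Copy assignment: the default would copy only the pointer value.
    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            // Allocate first, so that *this is unchanged if new throws.
            int* fresh = new int[other.size_];
            std::copy(other.data_, other.data_ + other.size_, fresh);
            delete[] data_;
            data_ = fresh;
            size_ = other.size_;
        }
        return *this;
    }

    ~Buffer() { delete[] data_; }

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};

void deep_copy_demo() {
    Buffer a(3);
    a[0] = 7;
    Buffer b(1);
    b = a;       // deep copy: b gets its own array
    a[0] = 99;   // does not affect b
    assert(b.size() == 3);
    assert(b[0] == 7);
}
```

That is roughly thirty lines of code whose only purpose is to make `b = a;` behave sensibly for one class.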

For evidence supporting my conjecture 2 above, I ask you to consider rvalue references. Rvalue references provide move semantics, and they solve the perfect forwarding problem. Neither the performance issue that move semantics address nor the perfect forwarding problem exists in classic OO languages that use reference semantics and garbage collection for user-defined types. C++ is solving its own specific problems here, and it does so in a way that is not transparent to the application programmer: rvalue references have increased the surface area of the language as seen by the application programmer. Again, this is merely one example of evidence for my conjecture; I am prepared to give more.
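For readers who have not used rvalue references, the following sketch shows both uses mentioned above. The helper `make` and the function `move_and_forward_demo` are hypothetical names for this illustration; `std::move` and `std::forward` are the standard library facilities.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: constructs a T from its arguments, preserving
// whether each argument was passed as an lvalue or as an rvalue.
template <typename T, typename... Args>
T make(Args&&... args) {
    return T(std::forward<Args>(args)...);
}

void move_and_forward_demo() {
    std::vector<int> source(1000, 42);
    const int* storage = source.data();

    // Move construction: the new vector takes over the heap buffer of
    // the old one instead of copying 1000 ints.
    std::vector<int> moved(std::move(source));
    assert(moved.data() == storage);
    assert(moved.size() == 1000);

    // Forwarding an lvalue results in a copy; the original is untouched.
    std::string text = "hello";
    std::string copy = make<std::string>(text);
    assert(text == "hello");
    assert(copy == "hello");

    // Forwarding an rvalue results in a move; the buffer changes hands again.
    std::vector<int> again = make<std::vector<int>>(std::move(moved));
    assert(again.data() == storage);
}
```

Note how much vocabulary the application programmer needs just to read this: `&&` in two different roles, `std::move`, `std::forward`, and the rules for template argument deduction that make `make` work.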

It has been said that there is such a thing as the DNA of a company or organization. If it is true that programming languages have DNAs as well, then it appears to me that C++ has built into its DNA

a willingness to let issues that are rooted in the internals of the language and its design spill into programmers’ everyday lives,

a tendency towards experimentation, a tendency to focus on the immediate benefit of a feature and fix problems later, and

a tendency to increase the volume and surface area of the language lightly, with little regard for the combinatorial complexity that arises from the interaction of features.

Again, not to make excuses for my inexcusable behavior of writing complicated and experimental code out of vanity, but I honestly feel that I have been tempted to do so by a culture that was indifferent to gratuitous complexity at best, encouraging it at worst. I believe that the stewards of a language that demands so much from its users, and gives them so much opportunity to go overboard with experimentation and complexity, have an obligation to make a strong effort at creating a culture of simplicity. To be perfectly honest, I don’t see that effort at all. In view of the competition from languages like Go, this makes me pessimistic about the future of C++.

Notes

[1] I can’t say that I have a particular preference for Java, certainly not for its heavy emphasis on OO. But to my surprise, I found that Java does what I now treasure most about a programming language: it gets out of my way.

[2] If there is one thing you take away from this blog post, let it be a motivation to read up on the history of object-oriented programming. The Wikipedia article is a good place to start.

[3] The acronym BCPL has been facetiously interpreted as “Before C Programming Language,” because BCPL was a precursor to C.

[4] Wikipedia does not give a source for this. Two commenters on this blog post have given two different sources where Bjarne Stroustrup made statements to this effect: Artashes Aghajanyan in the comments on this page, and claystu in the Hacker News discussion.

[5] In the unlikely event that I ever get on NPR’s “This I Believe,” I’ll say, “I believe in respect for working code and for software that makes people’s lives better.”

[6] It is perhaps noteworthy that Simula 67 had two notations for assignment, namely, := for value assignment of basic types and :- for reference assignment of user-defined types.

[7] The recursive nature of the copy assignment operator implies that in order to understand what the line x=y; does, we have to recursively inspect the source code of the types of all members to see if any of the copy assignment operators are overloaded.

[8] Actually, there is another option for implementing the copy assignment operator to get value semantics: one could call the destructor on the object on the left and then copy-construct the object on the right into the space formerly occupied by the object on the left. The so-called copy-and-swap idiom imitates that behavior.
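A minimal sketch of the copy-and-swap idiom, assuming a hypothetical class `IntArray` that owns a raw array (`copy_and_swap_demo` is likewise a made-up name): the assignment operator takes its argument by value, which copy-constructs the right-hand side, then swaps that copy into `*this` and lets the parameter's destructor dispose of the old state.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class, for illustration only.
class IntArray {
public:
    explicit IntArray(std::size_t n) : size_(n), data_(new int[n]()) {}

    IntArray(const IntArray& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    ~IntArray() { delete[] data_; }

    // Copy-and-swap: 'other' is already a copy of the right-hand side
    // because it is passed by value.
    IntArray& operator=(IntArray other) {
        swap(other);   // *this takes the copy's state
        return *this;  // 'other' now holds the old state and destroys it
    }

    void swap(IntArray& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    int& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};

void copy_and_swap_demo() {
    IntArray a(2);
    a[1] = 5;
    IntArray b(4);
    b = a;        // copy-construct, swap, destroy old state of b
    a[1] = 6;
    assert(b.size() == 2);
    assert(b[1] == 5);
    b = b;        // self-assignment is harmless: the copy is made first
    assert(b[1] == 5);
}
```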

[9] Anybody who has been around C++ since the mid-1990s will have to admit that the amount of time and resources spent on discussing the C++ copy assignment operator, and the problems that it has caused in code, are absolutely staggering. I’ve heard people say that this has gotten better because of a new best practice recommendation to shun C-style pointers altogether and use shared_ptr instead, as the latter plays well with the assignment’s default behavior of memberwise copying. If this is indeed the new best practice, then we have the following situation: we set out to decouple the OO paradigm from GC-supported reference semantics and combine it with the low-level efficiency of C. Instead, we’re now using shared_ptr, which gives us reference semantics supported by per-object reference counting. Compared to GC-supported reference semantics, reference semantics supported by per-object reference counting is

not faster (the details of the performance comparisons are gory, but by no stretch of the imagination is per-object reference counting the faster alternative),

more brittle (the programmer has to watch out for cyclic references and handle them via weak pointers, see also my conjecture 1 above), and

less powerful (the participants in cyclic references cannot be peers as is the case under a garbage collector, there has to be an ownership relation).
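The brittleness around cyclic references can be seen in a short sketch. All type and function names here (`Node`, `Parent`, `Child`, `strong_cycle_demo`, `weak_back_reference_demo`) are hypothetical; only `std::shared_ptr`, `std::weak_ptr`, and `std::make_shared` come from the standard library. The first function shows two peers holding each other alive after every outside reference is gone; the second shows the usual remedy, which forces an owner and a non-owning back reference.

```cpp
#include <cassert>
#include <memory>

// Two peers that refer to each other with owning pointers.
struct Node {
    std::shared_ptr<Node> peer;
};

void strong_cycle_demo() {
    std::weak_ptr<Node> observer;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->peer = b;
        b->peer = a;   // cycle: a owns b, b owns a
        observer = a;
    }
    // No outside references remain, yet the objects are still alive.
    assert(!observer.expired());

    // Break the cycle by hand so that this demo does not leak.
    if (auto a = observer.lock()) {
        a->peer.reset();
    }
    assert(observer.expired());
}

// The remedy: one side owns, the other side only observes.
struct Child;

struct Parent {
    std::shared_ptr<Child> child;   // owning reference
};

struct Child {
    std::weak_ptr<Parent> parent;   // non-owning back reference
};

void weak_back_reference_demo() {
    std::weak_ptr<Parent> observer;
    {
        auto p = std::make_shared<Parent>();
        p->child = std::make_shared<Child>();
        p->child->parent = p;
        observer = p;
        assert(p.use_count() == 1);   // weak_ptr does not add to the count
        assert(!p->child->parent.expired());
    }
    // Parent and child were both released when p went out of scope.
    assert(observer.expired());
}
```

Under a garbage collector, the `Node` version would simply work: both objects would be reclaimed once they became unreachable.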