Why I Dislike C++ For Large Projects

By Mark Roulo

12-June-2001

In a large program with raw pointers, the quality of the program will largely be driven by the least talented members of the team. This makes it highly dangerous to select a language that requires all the programmers to be in, say, the top 25% of the talent pool. Given a team with 10 developers (and my projects typically have more like 40-50), this seems to be asking for lots of long-term trouble.

Things become even more unstable when one considers that the average turnover of software developers in Silicon Valley (where I live and work) is something like 20-25% and that many large projects also use contractors. The total number of programmers on a project will be much higher than the average (or even the peak) number of developers on the project.

I have worked on a project with both C++ and Java components (roughly 50% each) that communicate via CORBA. Part of my job has been to interview candidates for both halves of the project over a period of several years. This has given me a fair amount of exposure to the C++ developer pool.

As part of my standard interview for C++ candidates I ask them to write me a small class with the intention of evaluating their command of the language. This also gives us a reasonable coding sample to discuss during the interview. I can ask about potential improvements, extensions and testing strategies.

Most of the candidates I interview have already made it to the top of the resume pool -- usually by claiming at least 3 years of professional experience with C++. Since many resumes have this, the candidates tend to have some other plus: large systems experience, degree from a good school, personal recommendation, etc.

The candidates then must survive a phone screen interview whose job is to weed out candidates who can't, for example, describe any of their projects coherently.

My request is to:

Write a Named Point class with three members: two floating point values for the coordinates on an X-Y plane, and a name represented as a 'char *'. Assume that this class will be used for some sort of wargame or simulation program that treats the world as flat and that these named points will be used to represent things like cities, battlefields, etc.

A typical first try looks something like this:

    class NamedPoint
    {
    private:
        float x;
        float y;
        char *name;

    public:
        NamedPoint (float x, float y, char *name)
        {
            this->x = x;
            this->y = y;
            this->name = name;
        }

        float getX() {return x;}
        float getY() {return y;}
        char *getName() {return name;}

        void setX(float x) {this->x = x;}
        void setY(float y) {this->y = y;}
        void setName(char *name) {this->name = name;}
    };

getName() violates the encapsulation of 'name': it returns a non-const pointer to the member, so callers can modify the string directly.

The code calling the constructor is responsible for managing the scope of the member variable 'name'. This code fragment shows the problem:

    // Ignore for now a lot of awfulness in this function ...
    // this should probably be a constructor in a sub-class
    // of NamedPoint, 'cityName' and 'countryName' should be
    // checked for NULL _and_ for length so that sprintf()
    // doesn't overrun temp ...
    //
    // The point is that if NamedPoint doesn't *own* its own
    // 'name' value, the clients are at risk of memory corruption.
    //
    NamedPoint makeCityCoordinate (float x, float y, char *cityName, char *countryName)
    {
        char temp[80];
        sprintf (temp, "City: %s, Country: %s", cityName, countryName);

        return NamedPoint (x, y, temp);
    }

The next version copies the name so that NamedPoint owns its own storage:

    class NamedPoint
    {
    private:
        float x;
        float y;
        char *name;

    public:
        NamedPoint (float x, float y, char *name)
        {
            this->x = x;
            this->y = y;
            this->name = new char[strlen(name) + 1];
            strcpy (this->name, name);
        }

        float getX() {return x;}
        float getY() {return y;}
        const char *getName() {return name;}

        void setX(float x) {this->x = x;}
        void setY(float y) {this->y = y;}
        void setName(char *name) {this->name = new char[strlen(name) + 1];
                                  strcpy (this->name, name);}
    };

It doesn't have a destructor, so it leaks memory.

setName() doesn't delete the old name, so it leaks more memory each time setName() is called.

Passing NULL to strlen() or strcpy() is undefined behavior. Usually, a program will crash if this is attempted, so we really should check for NULL.

The version after that adds a destructor and the NULL checks:

    class NamedPoint
    {
    private:
        float x;
        float y;
        char *name;

    public:
        NamedPoint (float x, float y, char *name)
        {
            this->x = x;
            this->y = y;
            if (name == NULL)
                this->name = NULL;
            else
            {
                this->name = new char[strlen(name) + 1];
                strcpy (this->name, name);
            }
        }

        ~NamedPoint ()
        {
            if (name != NULL)
                delete name;
        }

        float getX() {return x;}
        float getY() {return y;}
        const char *getName() {return name;}

        void setX(float x) {this->x = x;}
        void setY(float y) {this->y = y;}
        void setName(char *name) {if (this->name != NULL)
                                      delete this->name;
                                  if (name == NULL)
                                      this->name = NULL;
                                  else
                                  {
                                      this->name = new char[strlen(name) + 1];
                                      strcpy (this->name, name);
                                  }}
    };

NamedPoint allocates with 'new[]' but deletes with 'delete'. This may or may not work depending on the compiler. It seems to work fine for most current compilers, so I rarely comment on this. Still, it is incorrect.

Testing for NULL before calling delete is unnecessary (since 'delete 0' is defined to be harmless), but causes no damage other than slowing down the program slightly.
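To make the last two points concrete, here is a minimal sketch of a matched allocate/release pair. The helper names copyString(), freeString() and roundTripWorks() are hypothetical and are not part of the interview code; they only illustrate that new[] pairs with delete[] and that no NULL test is needed before deleting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the interview code): returns a heap copy
// of 'text' allocated with new[], or NULL when 'text' is NULL.
char *copyString(const char *text) {
    if (text == NULL) return NULL;
    char *copy = new char[std::strlen(text) + 1];  // array form of new
    std::strcpy(copy, text);
    return copy;
}

// The matching release: memory obtained from new[] goes back through delete[].
// Deleting a NULL pointer is defined to do nothing, so no test is needed here.
void freeString(char *text) {
    delete[] text;
}

// Usage: copy a name, compare it with the source, then release it.
bool roundTripWorks() {
    char *city = copyString("Paris");
    assert(city != NULL);
    bool same = (std::strcmp(city, "Paris") == 0);
    freeString(city);
    return same;
}
```

Keeping the allocation and the release side by side in two small functions is one way to make a mismatched delete harder to write; it is a design choice for this sketch, not something the candidates were asked to do.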

NamedPoint now trashes the heap if any NamedPoint objects are passed by value (like, for example, returning a NamedPoint object from a function). This is because the copy constructor that C++ gives us for free copies the 'name' pointer, but does not copy the contents. Now, calling the destructor on the first shared 'name' returns the memory to the heap (although the second copy will continue to use it, EVEN IF THE MEMORY GETS ALLOCATED TO SOME OTHER USE). Calling the destructor on the second shared 'name' probably corrupts the heap by deleting memory that was not, at that time, allocated (the second delete isn't required to corrupt the heap, but this is how most C++ heap managers work).

It has similar problems with the default assignment operator.
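The sharing that the compiler-generated copy constructor sets up can be observed safely with a small sketch. ShallowName and copySharesBuffer() are hypothetical names used only for illustration; the struct deliberately has no destructor, so the example shows the shared pointer without performing the double delete itself.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for NamedPoint with no user-written copy constructor,
// so the compiler generates a member-by-member copy.
// It deliberately has no destructor, which keeps this demonstration free of
// the double delete that the real class would perform.
struct ShallowName {
    char *name;

    explicit ShallowName(const char *text)
        : name(new char[std::strlen(text) + 1]) {
        std::strcpy(name, text);
    }
};

// Returns true when a by-value copy ends up sharing the original's buffer.
bool copySharesBuffer() {
    ShallowName original("Berlin");
    ShallowName copy = original;       // compiler-generated copy constructor

    bool shared = (copy.name == original.name);

    copy.name[0] = 'M';                // writes through the shared pointer
    bool originalChanged = (original.name[0] == 'M');

    delete[] original.name;            // release the single buffer exactly once
    return shared && originalChanged;
}
```

If ShallowName had the destructor from the version above, both 'original' and 'copy' would release the same buffer when they went out of scope.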

The next version adds a copy constructor and an assignment operator:

    class NamedPoint
    {
    private:
        float x;
        float y;
        char *name;

    public:
        NamedPoint (float x, float y, char *name)
        {
            this->x = x;
            this->y = y;
            if (name == NULL)
                this->name = NULL;
            else
            {
                this->name = new char[strlen(name) + 1];
                strcpy (this->name, name);
            }
        }

        ~NamedPoint ()
        {
            if (name != NULL)
                delete name;
        }

        // NOTE: Most interviewees start with a signature
        //       like this:
        //           NamedPoint (NamedPoint copy)
        //
        NamedPoint (const NamedPoint & copy)
        {
            this->x = copy.x;
            this->y = copy.y;
            if (copy.name != NULL)
            {
                this->name = new char[strlen (copy.name) + 1];
                strcpy (this->name, copy.name);
            }
        }

        NamedPoint & operator=(const NamedPoint & copy)
        {
            this->x = copy.x;
            this->y = copy.y;
            if (this->name != NULL)
                delete this->name;
            if (copy.name != NULL)
            {
                this->name = new char[strlen (copy.name) + 1];
                strcpy (this->name, copy.name);
            }
            // Note that we haven't nulled out this->name, so
            // we can get a double-delete problem...
        }

        float getX() {return x;}
        float getY() {return y;}
        const char *getName() {return name;}

        void setX(float x) {this->x = x;}
        void setY(float y) {this->y = y;}
        void setName(char *name) {if (this->name != NULL)
                                      delete this->name;
                                  if (name == NULL)
                                      this->name = NULL;
                                  else
                                  {
                                      this->name = new char[strlen(name) + 1];
                                      strcpy (this->name, name);
                                  }}
    };

I usually stop here (assuming we get this far).
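For reference, one possible version that avoids the problems listed so far might look like the sketch below. It is an illustration under stated assumptions, not the answer expected in the interview: the private helper duplicate() and the parameter names are hypothetical, and the class keeps the 'char *' member the exercise asks for.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class NamedPoint {
private:
    float x;
    float y;
    char *name;   // owned by this object; NULL means "no name"

    // Hypothetical helper: returns a new[]-allocated copy of 'text',
    // or NULL if 'text' is NULL.
    static char *duplicate(const char *text) {
        if (text == NULL) return NULL;
        char *copy = new char[std::strlen(text) + 1];
        std::strcpy(copy, text);
        return copy;
    }

public:
    NamedPoint(float xCoord, float yCoord, const char *text)
        : x(xCoord), y(yCoord), name(duplicate(text)) {}

    // Deep copy: each object gets its own buffer.
    NamedPoint(const NamedPoint &other)
        : x(other.x), y(other.y), name(duplicate(other.name)) {}

    NamedPoint &operator=(const NamedPoint &other) {
        if (this != &other) {                     // self-assignment is a no-op
            char *fresh = duplicate(other.name);  // allocate before releasing
            delete[] name;
            name = fresh;
            x = other.x;
            y = other.y;
        }
        return *this;
    }

    ~NamedPoint() { delete[] name; }              // matches new[] in duplicate()

    float getX() const { return x; }
    float getY() const { return y; }
    const char *getName() const { return name; }

    void setX(float newX) { x = newX; }
    void setY(float newY) { y = newY; }
    void setName(const char *text) {
        char *fresh = duplicate(text);            // copy first, in case 'text' is our own name
        delete[] name;
        name = fresh;
    }
};
```

The copy is made before the old buffer is released in both operator=() and setName(), so the object still holds a valid name if the allocation throws, and a call such as p.setName(p.getName()) does not read freed memory.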

Conclusion

Pointer assignment (a C legacy) makes it too easy to corrupt the stack and heap. The initial solution allows stack memory to be accessed after the function that owned it has returned. Corrected versions often allow for double deletes of heap-allocated storage or accessing already deleted heap storage or both.

The default copy constructor and assignment operator are too often wrong. But you get them unless you explicitly take action. The language default being fatally wrong is a big problem.

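delete and delete[] look similar but are not interchangeable: memory allocated with new[] must be released with delete[], and mixing the two may or may not work depending on the compiler.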

NULL is a legal value for many pointers, but the behavior when it is actually used tends to be undefined (delete being one nice exception). Since the NULL case is frequently overlooked, memory corruption again seems to be designed into large systems.

Larger programs encounter even more tricky problems. Scott Meyers has written two books on the subject of not getting killed by the C++ language. My point, though, is that most experienced C++ programmers I have interviewed can't get a simple class correct, even after multiple tries. Enough said. It makes me unwilling to risk a large project with the language.