Stepanov-Regularity and Partially-Formed Objects vs. C++ Value Types

In this article, I will take a look at two of the fundamental concepts introduced in Alex Stepanov and Paul McJones’ seminal book “Elements of Programming” (EoP for short): those of a (Semi-)Regular Type and of the Partially-Formed State.

Using these, I shall try to derive rules for C++ implementations of what are commonly called “value types”, focusing on the bare essentials, as I feel they have not been addressed in sufficient depth up to now: Special Member Functions.

Alex Stepanov and Paul McJones gave us a whole new way of looking at this, with a mathematical theory of types and algorithms quite unlike anything ever done before. Their achievement will forever change the way you look at computer programming, but eight years after its publication, the book still does not get the widespread adoption it deserves.

Setting The Stage

Special Member Functions, of course, are those member functions of a C++ class that the compiler can write for you: the default constructor, the copy and move constructors, the copy and move assignment operators, and the destructor.

A Regular Type in EoP roughly corresponds to the C++ concepts EqualityComparable and CopyConstructible combined; see the book for more details.

A C++ Value Type is a type that is defined by its state, and its state alone (note that EoP has a very different definition of value type). Take an int as an example. Two int objects of value 5 will behave identically under all regular operations (simplified: all operations except for taking the object’s address). Two Shape objects, however, both having the same position, color, texture, … still may end up a square and a triangle when drawn on screen. A Shape object is defined by its behaviour as much as its state. We call such types polymorphic.

There are many shades of grey in between those two extremes; let’s leave it at that crude distinction. See Designing value classes for modern C++ – Marc Mutz @ Meeting C++ 2014 for a somewhat more thorough treatment.

In this article, we will look at two different classes, Rect and Pen, and try to write their Special Member Functions, hopefully as Stepanov would have us do.

Rect and Pen

The first, Rect, is simple: it’s an integral-coordinate rectangle class that we will define completely inline in the header file. Pen, however, will be quite a bit different: it will use the Pimpl Idiom to firewall its internals from users. See Pimp My Pimpl and Pimp My Pimpl — Reloaded for more on the idiom.

    class Rect {
        int x1, y1, x2, y2;
    public:
    };

    class Pen {
        class Private; // defined out-of-line
        Private *d;
    public:
    };

The first task for today is to write the default constructor.

Default Construction

EoP has this to say about the default constructor:

[It] takes no arguments and leaves the object in a partially-formed state.

Ok, so what’s a “partially-formed state”? Here comes the good part:

An object is in a partially-formed state if it can be assigned-to or destroyed.

The authors go on to say that any other operation on partially-formed objects is undefined. In particular, such objects do not, in general, represent a valid value of the type.

The motivation for EoP to require default-construction in the first place is programmer convenience: T a = b; should be equivalent to T a; a = b;, and the user of the type should get to choose whether to write

    T a;
    if (cond)
        a = b;
    else
        a = c;

or

T a = (cond) ? b : c;

Without default construction, if all the type’s author gave are user-defined constructors that establish a valid value, the programmer would have to use the ternary operator, whether or not that fits with line length limitations and personal preferences.
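To make the trade-off concrete, here is a minimal sketch of both situations. The types Metres and Feet and the helper functions are hypothetical and exist only for this illustration; they do not come from EoP.

```cpp
#include <cassert>

// Hypothetical value type WITHOUT a default constructor.
class Metres {
    int v;
public:
    explicit Metres(int value) noexcept : v(value) {}
    int value() const noexcept { return v; }
};

// Hypothetical value type WITH an EoP-style default constructor.
class Feet {
    int v;
public:
    Feet() = default;  // leaves v uninitialised: partially-formed
    explicit Feet(int value) noexcept : v(value) {}
    int value() const noexcept { return v; }
};

// No default constructor: the value has to be chosen in the initialiser.
inline Metres pickMetres(bool cond, Metres b, Metres c) {
    Metres a = cond ? b : c;
    return a;
}

// Default constructor available: declare first, assign on each branch.
inline Feet pickFeet(bool cond, Feet b, Feet c) {
    Feet a;            // partially-formed; only assignment or destruction allowed
    if (cond)
        a = b;
    else
        a = c;
    return a;          // a holds a valid value on every path
}

// Small usage example.
inline void demoChoice() {
    assert(pickMetres(true, Metres(1), Metres(2)).value() == 1);
    assert(pickFeet(false, Feet(1), Feet(2)).value() == 2);
}
```

Both functions compute the same thing; the only difference is that the author of Feet left the choice of style to the user.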

The comments at the end of the article contain even more reasons to support default construction.

A default constructor for Rect

So, let’s try to write something for Rect:

    class Rect {
        int x1, y1, x2, y2;
    public:
        Rect() = default;
    };

What do you think? Would you have written the Rect default constructor this way?

I can tell you I wouldn’t have. Not until EoP opened my eyes. Remember that EoP only requires that the default constructor establish a partially-formed state, not a valid value. This should not surprise you. When in C++, do as the ints do:

    int x;
    Rect r;

In both cases, any use of the default-constructed object other than assignment or destruction is undefined, because the values of the objects are undefined (uninitialised).

If you feel uncomfortable with this implementation, you’re letting your inner Java programmer get the better of you. Don’t. This is C++. We embrace the undefined.

And, as Howard Hinnant writes in a reddit comment on this article, we give power to our users:

    int x = {};  // x == 0
    Rect r = {}; // r == {0, 0, 0, 0}
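The difference between the two spellings can be sketched as follows. The accessors and the helper isAllZero() are assumptions added only so that the coordinates can be inspected; the article's Rect does not declare them. The zeroing relies on the default constructor being defaulted on its first declaration, which makes Rect r = {}; a value-initialisation.

```cpp
#include <cassert>

class Rect {
    int x1, y1, x2, y2;
public:
    Rect() = default;  // not user-provided, so `Rect r = {};` zero-initialises

    // Hypothetical accessors, for inspection only.
    int left() const noexcept { return x1; }
    int top() const noexcept { return y1; }
    int right() const noexcept { return x2; }
    int bottom() const noexcept { return y2; }
};

// Hypothetical helper: true if all four coordinates are zero.
inline bool isAllZero(const Rect &r) noexcept {
    return r.left() == 0 && r.top() == 0 && r.right() == 0 && r.bottom() == 0;
}

inline void demoValueInit() {
    Rect a;       // partially-formed: reading a.left() here would be undefined
    Rect b = {};  // value-initialised: all coordinates are zero
    a = b;        // assignment is one of the two operations allowed on a
    assert(isAllZero(a));
}
```

The user who wants zeros asks for them with = {}; the user who is about to assign anyway does not pay for the initialisation.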

Next, let’s try Pen.

A default constructor for Pen

    class Pen {
        class Private; // defined out-of-line
        Private *d;
    public:
        Pen() : d(nullptr) {}  // inline
        ~Pen() { delete d; }   // out-of-line
    };

Should we have left Pen::d uninitialised, too?

No. Doing so would make destruction undefined.

Should we have allocated a Pen::Private object with new and stored it in Pen::d in the default constructor?

That would be a no, too. We’re not required to establish a valid value in the default constructor, so in the spirit of “don’t pay for what you don’t use”, we only do the minimal work necessary to establish a partially-formed state.

To hammer this one home: Should an implementation of

Colour Pen::colour() const;

check for d == nullptr?

No, for the third time. You can see at a glance in the source code whether an object is in a partially-formed state. There is no need for a runtime check, except for debugging purposes.
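One way to express “no runtime check, except for debugging purposes” is an assert, which disappears when NDEBUG is defined. The following is only a sketch: the Colour enumeration, the value constructor and the layout of Pen::Private are assumptions made up for the example, and everything is shown in one file although Private would normally live in the implementation file.

```cpp
#include <cassert>

enum class Colour { Black, Red };  // hypothetical stand-in for a real colour type

class Pen {
    class Private;
    Private *d;
public:
    Pen() noexcept : d(nullptr) {}          // partially-formed
    explicit Pen(Colour c);                 // hypothetical value constructor
    Pen(const Pen &) = delete;              // copying is out of scope for this sketch
    Pen &operator=(const Pen &) = delete;
    ~Pen();
    Colour colour() const;
};

// Normally defined out-of-line in the .cpp file.
class Pen::Private {
public:
    Colour colour;
};

Pen::Pen(Colour c) : d(new Private{c}) {}
Pen::~Pen() { delete d; }  // fine for d == nullptr, too

Colour Pen::colour() const {
    // Debug-only diagnostic; compiled out when NDEBUG is defined.
    assert(d && "colour() called on a partially-formed Pen");
    return d->colour;  // no branch on d in release builds
}

inline void demoColour() {
    Pen red(Colour::Red);
    assert(red.colour() == Colour::Red);
    Pen idle;  // never queried, only destroyed: perfectly fine
}
```

In a release build, colour() is a plain pointer dereference; calling it on a partially-formed Pen stays undefined, exactly as reading an uninitialised int would be.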

From the above, it follows that your default constructors should be noexcept. If your default constructors throw, they do too much. Of course, we’re still talking Value Types here, so let no man say that yours truly told you to make the default constructors of your RAII types noexcept.
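This guideline can be checked at compile time with the standard type trait std::is_nothrow_default_constructible. The sketch below repeats Rect and Pen in a single file; EagerPen is a hypothetical counter-example, invented here, whose default constructor allocates and therefore cannot promise not to throw.

```cpp
#include <cassert>
#include <type_traits>

class Rect {
    int x1, y1, x2, y2;
public:
    Rect() = default;  // trivial, hence implicitly noexcept
};

class Pen {
    class Private;
    Private *d;
public:
    Pen() noexcept : d(nullptr) {}
    Pen(const Pen &) = delete;             // copying is out of scope for this sketch
    Pen &operator=(const Pen &) = delete;
    ~Pen();
};

class Pen::Private {};     // normally defined in the .cpp file
Pen::~Pen() { delete d; }

// Hypothetical counter-example: establishes a "valid value" eagerly.
class EagerPen {
    int *d;
public:
    EagerPen() : d(new int(0)) {}          // may throw std::bad_alloc
    EagerPen(const EagerPen &) = delete;
    EagerPen &operator=(const EagerPen &) = delete;
    ~EagerPen() { delete d; }
};

static_assert(std::is_nothrow_default_constructible<Rect>::value,
              "Rect() must not throw");
static_assert(std::is_nothrow_default_constructible<Pen>::value,
              "Pen() must not throw");
static_assert(!std::is_nothrow_default_constructible<EagerPen>::value,
              "EagerPen() allocates, so it cannot be noexcept");

// The same property, queried with the noexcept operator.
inline bool defaultCtorsAreNoexcept() {
    return noexcept(Rect()) && noexcept(Pen());
}
```

Putting such a static_assert next to the class definition turns the guideline into something the compiler enforces.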

Move-Construction And Move-Assignment

For Rect, moving and copying are the same thing, and the compiler is in the best position to implement them for you:

    class Rect {
        int x1, y1, x2, y2;
    public:
        Rect() = default;
        // compiler-generated copy/move special member functions are ok!
    };

Once more, Pen is a bit more interesting:

    class Pen {
        class Private; // defined out-of-line
        Private *d;
    public:
        Pen() noexcept : d(nullptr) {}                                  // inline
        Pen(Pen &&other) noexcept : d(other.d) { other.d = nullptr; }   // inline
        ~Pen() { delete d; }                                            // out-of-line
    };

We put moved-from Pen objects into the partially-formed state. In other words: moving from an object has the same effect as default-construction. Can it get any simpler?

We delegate move-assignment to the move constructor:

    // requires <utility> for std::move and std::swap
    class Pen {
        class Private; // defined out-of-line
        Private *d;
    public:
        Pen() noexcept : d(nullptr) {}                                  // inline
        Pen(Pen &&other) noexcept : d(other.d) { other.d = nullptr; }   // inline
        Pen &operator=(Pen &&other) noexcept                            // inline
        {
            Pen moved(std::move(other));
            swap(moved);
            return *this;
        }
        ~Pen() { delete d; }                                            // out-of-line
        void swap(Pen &other) noexcept
        {
            using std::swap;
            swap(d, other.d);
        }
    };

Note how all special member functions except the destructor are inline so far, yet we didn’t break encapsulation of the Pen::Private class.
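To see these operations at work, here is a single-file sketch that completes Pen with a hypothetical value constructor and a width() accessor. Both are assumptions for illustration, as is the layout of Pen::Private, which would normally sit in the implementation file.

```cpp
#include <cassert>
#include <utility>

class Pen {
    class Private;
    Private *d;
public:
    Pen() noexcept : d(nullptr) {}
    explicit Pen(int width);  // hypothetical value constructor
    Pen(Pen &&other) noexcept : d(other.d) { other.d = nullptr; }
    Pen &operator=(Pen &&other) noexcept {
        Pen moved(std::move(other));  // steal other's state
        swap(moved);                  // old state of *this now lives in moved
        return *this;                 // moved's destructor releases the old state
    }
    ~Pen();
    void swap(Pen &other) noexcept {
        using std::swap;
        swap(d, other.d);
    }
    int width() const;        // hypothetical accessor
};

// Normally defined out-of-line in the .cpp file.
class Pen::Private {
public:
    int width;
};

Pen::Pen(int width) : d(new Private{width}) {}
Pen::~Pen() { delete d; }
int Pen::width() const {
    assert(d && "width() called on a partially-formed Pen");
    return d->width;
}

inline int demoMove() {
    Pen a(3);
    Pen b(std::move(a));  // a is now partially-formed
    a = Pen(7);           // assigning to a partially-formed object is allowed
    Pen c;                // partially-formed from the start
    c = std::move(b);     // b is now partially-formed
    return a.width() * 10 + c.width();
}                         // destroying b (partially-formed) is allowed, too
```

Throughout demoMove(), every partially-formed object is either assigned to or destroyed, and nothing else is done with it.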

Controversy

Thanks in no small part to the ISO C++ standard, which describes moved-from objects (in [lib.types.movedfrom]) as follows:

Objects of types defined in the C++ standard library may be moved from. Move operations may be explicitly specified or implicitly generated. Unless otherwise specified, such moved-from objects shall be placed in a valid but unspecified state.

the simple chain of reasoning described so far has fewer friends than you might think. And this is why I wrote this article.

You will probably meet a lot of resistance when trying to implement your default and move constructors this way. But think about it: What would a natural “default value” of your type be?

It’s easy to fall for the next-best choice: for int, surely the default-constructed value should be zero, and we just have to put up with these partially-formed, nay: uninitialised, values because C sucks.

I disagree. If you are using the int additively, then, yes, zero is a good default value. But if you work with multiplication, then one would be the better fit.

Bottom line: for the vast majority of types, there is no natural default. If there isn’t, then having to establish an arbitrarily-chosen one on every default-construction operation is wasteful, so don’t do it.
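The int example can be made tangible with std::accumulate, which makes the caller name the starting value explicitly. The helpers sum() and product() are hypothetical names chosen for this sketch.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Additive use: zero is the neutral starting value.
inline int sum(const std::vector<int> &v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// Multiplicative use: one is the neutral starting value.
inline int product(const std::vector<int> &v) {
    return std::accumulate(v.begin(), v.end(), 1, std::multiplies<int>());
}

inline void demoIdentities() {
    const std::vector<int> v = {2, 3, 4};
    assert(sum(v) == 9);
    assert(product(v) == 24);
    // Starting a product from the "obvious" default of zero wipes out the result.
    assert(std::accumulate(v.begin(), v.end(), 0, std::multiplies<int>()) == 0);
}
```

Neither 0 nor 1 is the default of int; each is the right starting point for one particular use.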

Instead, have the default constructor establish only a partially-formed state, and provide literals (or named factory functions for something more complex) for the different “default” values:

    class Rect {
        static constexpr Rect emptyRect = {};
    };

    class Pen {
        static Pen none();
        static Pen solidBlackCosmetic();
    };
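How such named factory functions might be implemented for Pen is sketched below. The Style enumeration, the accessors, the private constructor and the contents of Pen::Private are all assumptions for illustration. Reading “cosmetic” as “width zero” is likewise an assumption, and the colour is left out to keep the example short.

```cpp
#include <cassert>

class Pen {
public:
    enum Style { NoPen, SolidLine };  // hypothetical
private:
    class Private;
    Private *d;
    explicit Pen(Private *p) noexcept : d(p) {}  // used by the factories only
public:
    Pen() noexcept : d(nullptr) {}               // partially-formed, no allocation
    Pen(Pen &&other) noexcept : d(other.d) { other.d = nullptr; }
    ~Pen();

    // Named "default" values: the caller states which one is wanted.
    static Pen none();
    static Pen solidBlackCosmetic();

    Style style() const;  // hypothetical accessors
    int width() const;
};

// Normally defined out-of-line in the .cpp file.
class Pen::Private {
public:
    Style style;
    int width;
};

Pen::~Pen() { delete d; }
Pen Pen::none() { return Pen(new Private{NoPen, 1}); }
Pen Pen::solidBlackCosmetic() { return Pen(new Private{SolidLine, 0}); }

Pen::Style Pen::style() const {
    assert(d && "style() called on a partially-formed Pen");
    return d->style;
}

int Pen::width() const {
    assert(d && "width() called on a partially-formed Pen");
    return d->width;
}

inline void demoFactories() {
    Pen p = Pen::solidBlackCosmetic();  // pays for the allocation, by request
    assert(p.style() == Pen::SolidLine && p.width() == 0);
    Pen q;                              // pays for nothing
}
```

Only the call sites that ask for a particular value pay for establishing it; plain Pen q; stays as cheap as setting a pointer to null.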

Embracing Partially-Formed Objects

Partially-Formed Objects are nothing magical. They offer a simple description of the behaviour of C++ built-in types with respect to default construction, and of pimpl’ed objects with respect to move semantics, if implemented in the natural way.

In both cases, partially-formed objects are easily spotted in source code with local static reasoning, so demands for anything fancier than the bare minimum as the result of moving from an object or default-constructing one violate the C++ principle of “don’t pay for what you don’t use”. As a corollary, keep your default constructors noexcept.

In a future instalment, we will look at a smart pointer that encodes these guidelines for use as a pimpl-pointer.