Getting the Benefits of Strong Typing in C++ at a Fraction of the Cost

Guest writer Vincent Zalzal talks to us about lightweight strong types. Vincent is a software developer who has been working in the computer vision industry for the last 12 years. He appreciates all the levels of complexity involved in software development, from optimizing memory cache accesses, to devising algorithms and heuristics that solve complex problems, all the way to developing stable and user-friendly frameworks. You can find him online on Twitter or LinkedIn.

Strong types promote safer and more expressive code. I won’t repeat what Jonathan has presented already in his series on strong types.

I suspect some people may find that the NamedType class template has a nice interface but uses somewhat heavy machinery to achieve the modest goal of strong typing. For those people, I have good news: you can achieve much of the functionality of NamedType with a very simple tool. That tool is the humble struct.

Struct as strong type

Let’s look at a simplified version of NamedType, without Skills:

    template <typename T, typename Parameter>
    class NamedType
    {
    public:
        explicit NamedType(T const& value) : value_(value) {}

        template<typename T_ = T, typename = IsNotReference<T_>>
        explicit NamedType(T&& value) : value_(std::move(value)) {}

        T& get() { return value_; }
        T const& get() const { return value_; }

    private:
        T value_;
    };

This class is hiding the underlying value, and giving access to it with get(). There seems to be no set() method, but it is still there, hidden in the get() function. Indeed, since the get() function returns a non-const reference, we can do:

    using Width = NamedType<double, struct WidthTag>;
    Width width(42);
    width.get() = 1337;

Since the get() method is not enforcing any invariant and the underlying value is accessible, it is essentially public. Let’s make it public then! By doing so, we get rid of the get() functions. Also, since everything in the class is public, and since, semantically, it is not enforcing any invariant, let’s use a struct instead:

    template <typename T, typename Parameter>
    struct NamedType
    {
        explicit NamedType(T const& value) : value_(value) {}

        template<typename T_ = T, typename = IsNotReference<T_>>
        explicit NamedType(T&& value) : value_(std::move(value)) {}

        T value_;
    };

But wait: do we really need those explicit constructors? If we remove them, we can use aggregate initialization, which performs exactly the same thing. We end up with:

    template <typename T, typename Parameter>
    struct NamedType
    {
        T value_;
    };
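To make the claim about aggregate initialization concrete, here is a minimal sketch using the constructor-free template. The aliases, the tag names (WidthTag, HeightTag, NameTag) and the function aggregate_init_demo are assumptions made up for this illustration; they are not part of NamedType.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Constructor-free version of the template from the text.
template <typename T, typename Parameter>
struct NamedType
{
    T value_;
};

// Hypothetical aliases, for illustration only.
using Width  = NamedType<double, struct WidthTag>;
using Height = NamedType<double, struct HeightTag>;
using Name   = NamedType<std::string, struct NameTag>;

// Different tags still produce distinct, non-interchangeable types.
static_assert(!std::is_same<Width, Height>::value,
              "tags must produce distinct types");

inline std::string aggregate_init_demo()
{
    Width width{42.0};            // aggregate initialization from a value
    assert(width.value_ == 42.0);

    std::string text = "door";
    Name copied{text};            // lvalue: the string is copied
    Name moved{std::move(text)};  // rvalue: the string is moved from
    return copied.value_ + "/" + moved.value_;
}
```

Both objects end up holding the original text, which is the behavior the two explicit constructors provided.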

That struct is not reusing code anymore. So the last simplification is to use a non-template struct directly to define the strong type.

    struct Width { double v; };

There you have it: a strong type, without heavy machinery. Want to see it in action?

    struct Width { double v; };
    struct Height { double v; };

    class Rectangle { /* ... */ };

    Rectangle make_rect(Width width, Height height) { return Rectangle(/* ... */); }
    Rectangle make_square(Width width) { return Rectangle(/* ... */); }

    void foo()
    {
        // Aggregate initialization copies lvalues and moves rvalues.
        Width width {42.0};

        // constexpr also works.
        constexpr Width piWidth {3.1416};

        // get() and set() are free.
        // set() copies lvalues and moves rvalues.
        double d = width.v;
        width.v = 1337.0;

        // Copy and move constructors are free.
        Width w1 {width};
        Width w2 {std::move(w1)};

        // Copy and move assignment operators are free.
        w1 = width;
        w2 = std::move(w1);

        // Call site is expressive and type-safe.
        auto rect = make_rect(Width{1.618}, Height{1.0});
        // make_rect(Height{1.0}, Width{1.618}); does not compile

        // Implicit conversions are disabled by default.
        // make_rect(1.618, 1.0); does not compile
        // double d1 = w1; does not compile

        // Call site can also be terse, if desired (not as type-safe though).
        auto square = make_square( {2.718} );
    }

This code looks a lot like the one you would get using NamedType (except for the last line that would be prevented by the explicit constructor). Here are some added benefits of using structs as strong types:

more readable stack traces (NamedType can generate pretty verbose names)

code easier to understand for novice C++ developers and thus easier to adopt in a company

one fewer external dependency

I like the convention of using v for the underlying value, because it mimics what the standard uses for variable templates, like std::is_arithmetic_v or std::is_const_v. Naturally, you can use whatever you find best, like val or value. Another nice convention is to use the underlying type as name:

    struct Width { double asDouble; };

    void foo()
    {
        Width width {42};
        auto d = width.asDouble;
    }

Skills

Using the struct as presented above requires accessing the underlying member directly. Often, few operations on the struct are necessary, and direct access to the underlying member can be hidden in member functions of the class using the strong type. However, in other cases where arithmetic operations are necessary, for example, in the case of a width, then skills are needed to avoid having to implement operators again and again.
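As a minimal sketch of the first case, where direct member access stays hidden inside the class that uses the strong types, consider the following. The Rectangle interface shown here (area, widen) is a hypothetical example, not code from the article.

```cpp
#include <cassert>

struct Width  { double v; };
struct Height { double v; };

// Hypothetical class: the strong types appear only in the interface,
// and .v is touched only inside the member functions.
class Rectangle
{
public:
    Rectangle(Width width, Height height)
        : width_(width.v), height_(height.v) {}

    double area() const { return width_ * height_; }

    // Callers still have to say what kind of quantity they pass in.
    void widen(Width extra) { width_ += extra.v; }

private:
    double width_;
    double height_;
};

inline double rectangle_demo()
{
    Rectangle rect(Width{2.0}, Height{3.0});
    assert(rect.area() == 6.0);
    rect.widen(Width{1.0});
    return rect.area();
}
```

No operator is needed on Width or Height here: the strong types only guard the call sites.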

The inheritance approach used by NamedType or boost::operators works well. I do not claim that the method I will present here is elegant, but it is an alternative to using inheritance that has advantages, notably simplicity.

Operator Overloading

First, note that almost all operators in C++ can be implemented as non-member functions. Here are the operators that cannot be implemented as non-member functions:

assignment, i.e. operator= (in our case, the implicitly-generated version is okay)

function call, i.e. operator()

subscripting, i.e. operator[]

class member access, i.e. operator->

conversion functions, e.g. operator int()

class-specific allocation and deallocation functions (new, new[], delete, delete[]), which must be static members

All other overloadable operators can be implemented as non-member functions. As a refresher, here they are:

– unary: + - * & ~ ! ++ (pre and post) -- (pre and post)

– binary: + - * / % ^ & | < > += -= *= /= %= ^= &= |= << >> >>= <<= == != <= >= && || , ->*

As an example, for the Width type above, the less-than operator would look like this:

    inline bool operator<(Width lhs, Width rhs)
    {
        return lhs.v < rhs.v;
    }

As a side note, I chose to pass the widths by value in the code above for performance reasons. Given their small size, those structs are typically passed directly in registers, like arithmetic types. The optimizer will also optimize the copy away, since it is working mostly on arithmetic types here. Finally, for binary operations, further optimizations are sometimes possible because the compiler knows for certain there is no aliasing, i.e. the two operands do not share the same memory. For bigger structs (my personal threshold is more than 8 bytes) or structs with non-trivial constructors, I would pass the parameters by const lvalue reference.
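The rule of thumb above can be checked at compile time. The sketch below is only one possible encoding: the trait PassByValue and the struct Matrix3 are hypothetical names, and "trivially copyable" is used as an approximation of "no non-trivial constructors". The size check holds on mainstream ABIs but is not guaranteed by the standard.

```cpp
#include <type_traits>

struct Width { double v; };

// Hypothetical bigger struct holding nine doubles.
struct Matrix3 { double m[9]; };

// Hypothetical trait: small and trivially copyable types are
// candidates for pass-by-value (the 8-byte threshold is the
// author's personal one, not a language rule).
template <typename T>
struct PassByValue
    : std::integral_constant<bool,
          sizeof(T) <= 8 && std::is_trivially_copyable<T>::value>
{};

// Wrapping a double in a struct is not expected to add any size.
static_assert(sizeof(Width) == sizeof(double),
              "the wrapper should not add padding on this platform");
static_assert(PassByValue<Width>::value,
              "Width is small and trivially copyable");
static_assert(!PassByValue<Matrix3>::value,
              "Matrix3 would be passed by const reference");
```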

All other relational operators would have to be defined similarly. To avoid repeating that code over and over again for each strong type, we must find a way to generate that code.
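Written out by hand for Width, the full relational set might look like the sketch below. Each operator forwards to the same operator on the underlying double, so special values such as NaN behave exactly as they do for plain doubles. The helper relational_demo is only there to exercise the operators.

```cpp
#include <cassert>

struct Width { double v; };

// Every operator is a non-member function taking its operands by value.
inline bool operator< (Width lhs, Width rhs) { return lhs.v <  rhs.v; }
inline bool operator> (Width lhs, Width rhs) { return lhs.v >  rhs.v; }
inline bool operator<=(Width lhs, Width rhs) { return lhs.v <= rhs.v; }
inline bool operator>=(Width lhs, Width rhs) { return lhs.v >= rhs.v; }
inline bool operator==(Width lhs, Width rhs) { return lhs.v == rhs.v; }
inline bool operator!=(Width lhs, Width rhs) { return lhs.v != rhs.v; }

inline bool relational_demo()
{
    Width narrow{1.0};
    Width wide{2.0};
    assert(narrow < wide);
    assert(wide >= narrow);
    return narrow != wide;
}
```

Six near-identical lines per strong type is the duplication that a code generator is meant to remove.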

The Inheritance Approach

NamedType uses inheritance and CRTP as code generator. It has the advantage of being part of the language. However, it pollutes the type name, especially when looking at a call stack. For example, the function:

    using NT_Int32 = fluent::NamedType<int32_t, struct Int32, fluent::Addable>;
    void vectorAddNT(NT_Int32* dst, const NT_Int32* src1, const NT_Int32* src2, int N);

results in the following line in the call stack:

    vectorAddNT(fluent::NamedType<int,Int32,fluent::Addable> * dst, const fluent::NamedType<int,Int32,fluent::Addable> * src1, const fluent::NamedType<int,Int32,fluent::Addable> * src2, int N)

This is for one skill; the problem gets worse the more skills are added.
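For readers unfamiliar with the technique, here is a minimal sketch of how a CRTP base class can act as a code generator. It is not NamedType's actual implementation: AddableSkill is a hypothetical name, and the sketch assumes the derived type exposes a public member called v.

```cpp
#include <cassert>

// Hypothetical skill: any Derived inheriting from AddableSkill<Derived>
// gets an operator+ for free. The friend function is found through
// argument-dependent lookup because the skill is a base class of Derived.
template <typename Derived>
struct AddableSkill
{
    friend Derived operator+(Derived const& lhs, Derived const& rhs)
    {
        Derived result(lhs);
        result.v += rhs.v;
        return result;
    }
};

struct Width : AddableSkill<Width>
{
    explicit Width(double value) : v(value) {}
    double v;
};

inline double crtp_demo()
{
    Width total = Width(1.5) + Width(2.0);
    return total.v;
}
```

The skill appears in the type's list of base classes, which is the same mechanism that makes the template arguments show up in the call stack above.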

The Preprocessor Approach

The oldest code generator would be the preprocessor. Macros could be used to generate the operator code. But code in macros is rarely a good option, because macros can’t be stepped into while debugging.
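For comparison with what follows, a macro-based generator could be sketched like this. The macro name UTIL_DEFINE_LESS_THAN_COMPARABLE is made up for this example, and the sketch assumes the type has a member called v.

```cpp
#include <cassert>

// Hypothetical macro: generates two comparison operators for Type.
// The generated bodies live on a single logical line, which is why
// a debugger cannot step through them statement by statement.
#define UTIL_DEFINE_LESS_THAN_COMPARABLE(Type)                          \
    inline bool operator<(Type lhs, Type rhs) { return lhs.v < rhs.v; } \
    inline bool operator>(Type lhs, Type rhs) { return lhs.v > rhs.v; }

struct Width  { double v; };
struct Height { double v; };

UTIL_DEFINE_LESS_THAN_COMPARABLE(Width)
UTIL_DEFINE_LESS_THAN_COMPARABLE(Height)

inline bool macro_demo()
{
    assert(Width{1.0} < Width{2.0});
    return Height{3.0} > Height{2.0};
}
```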

Another way to use the preprocessor as code generator is to use include files. Breakpoints can be set in included files without problem, and they can be stepped into. Unfortunately, to pass parameters to the code generator, we must resort to using define directives, but it is a small price to pay.

    struct Width { double v; };

    #define UTIL_OP_TYPE_T_ Width
    #include <util/operators/less_than_comparable.hxx>
    #undef UTIL_OP_TYPE_T_

The file less_than_comparable.hxx would look like this:

    inline bool operator<(UTIL_OP_TYPE_T_ lhs, UTIL_OP_TYPE_T_ rhs)
    {
        return lhs.v < rhs.v;
    }
    inline bool operator>(UTIL_OP_TYPE_T_ lhs, UTIL_OP_TYPE_T_ rhs)
    {
        return lhs.v > rhs.v;
    }
    // ...

It is a good idea to use a different extension than usual for files included in this way. These are not normal headers; for example, header guards must absolutely not be used in them. The extension .hxx is less frequently used, but it is recognized as C++ code by most editors, so it can be a good choice.

To support other operators, you simply include multiple files. It is possible (and desirable) to create a hierarchy of operators, as is done in boost::operators (where the name less_than_comparable comes from). For example, the skills addable and subtractable could be grouped under the name additive.

    struct Width { double v; };

    #define UTIL_OP_TYPE_T_ Width
    #include <util/operators/additive.hxx>
    #include <util/operators/less_than_comparable.hxx>
    // ...
    #undef UTIL_OP_TYPE_T_

    // util/operators/additive.hxx
    #include <util/operators/addable.hxx>
    #include <util/operators/subtractable.hxx>

    // util/operators/addable.hxx
    inline UTIL_OP_TYPE_T_ operator+(UTIL_OP_TYPE_T_ lhs, UTIL_OP_TYPE_T_ rhs)
    {
        return {lhs.v + rhs.v};
    }
    inline UTIL_OP_TYPE_T_& operator+=(UTIL_OP_TYPE_T_& lhs, UTIL_OP_TYPE_T_ rhs)
    {
        lhs.v += rhs.v;
        return lhs;
    }
    // etc

It may come as a surprise that operator+= can be implemented as a non-member function. I think it highlights the fact that the struct is seen as data, not as an object. It has no member function in itself. However, as mentioned above, there are a few operators that cannot be implemented as non-member functions, notably operator->.

I would argue that if you need to overload those operators, the strong type is not semantically a struct anymore, and you would be better off using NamedType.

However, nothing prevents you from including files inside the struct definition, even if a few people may cringe when seeing this:

    #define UTIL_OP_TYPE_T_ WidgetPtr
    struct WidgetPtr
    {
        std::unique_ptr<Widget> v;
        #include <util/operators/dereferenceable.hxx>
    };
    #undef UTIL_OP_TYPE_T_
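The contents of dereferenceable.hxx are not shown in this article. As a guess at what such a file could expand to, here is the struct with the operators written inline. The two member operators and the Widget definition are assumptions made for illustration; decltype is used because a generic include file cannot know the pointee type.

```cpp
#include <cassert>
#include <memory>

struct Widget { int id; };

struct WidgetPtr
{
    std::unique_ptr<Widget> v;

    // Presumed expansion of the included file (hypothetical): both
    // operators simply forward to the wrapped member v.
    auto operator->() const -> decltype(v.operator->()) { return v.operator->(); }
    auto operator*() const -> decltype(*v) { return *v; }
};

inline int dereference_demo()
{
    // Member functions do not prevent aggregate initialization.
    WidgetPtr ptr{std::unique_ptr<Widget>(new Widget{7})};
    assert((*ptr).id == 7);
    return ptr->id;
}
```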

The Code Generator Approach

Big companies like Google rely more and more on bots to generate code (see protobuf) and commits (see this presentation). The obvious drawback of the method is that you need an external tool (like Cog for example) integrated into the build system to generate the code. However, once the code is generated, it is very straightforward to read and use (and also to analyze and compile). Since each strong type has its own generated copy, it is also easier to set a breakpoint in a function for a specific type.
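As a rough sketch of what this could look like with Cog, the generator script is embedded in a comment and the generated code is written between the markers, so the file remains valid C++ whether or not the tool has been run. The choice of types and the exact generator script below are illustrative assumptions.

```cpp
#include <cassert>

struct Width  { double v; };
struct Height { double v; };

/*[[[cog
import cog
for type_name in ['Width', 'Height']:
    cog.outl('inline bool operator<(%s lhs, %s rhs) { return lhs.v < rhs.v; }'
             % (type_name, type_name))
]]]*/
inline bool operator<(Width lhs, Width rhs) { return lhs.v < rhs.v; }
inline bool operator<(Height lhs, Height rhs) { return lhs.v < rhs.v; }
//[[[end]]]

inline bool generated_demo()
{
    assert(Width{1.0} < Width{2.0});
    return Height{1.0} < Height{2.0};
}
```

The two operator definitions between the markers are what the tool would regenerate; each strong type gets its own plain function, which is what makes type-specific breakpoints easy.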

Using a tool to generate code can lead to an elegant pseudo-language of keywords added to the language. This is the approach taken by Qt, and they defend it well (see Why Does Qt Use Moc for Signals and Slots?).

Skills for enums

Skills can also be useful on enums to implement bit flags. As a side note, the inheritance approach cannot be applied to enums, since they can’t inherit functionality. However, strategies based on non-member functions can be used in that case. Bit flags are an interesting use case that deserves an article of its own.
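To give a flavor of the idea, here is a small sketch of bit-flag operators written as non-member functions on a scoped enum. The Permission enum and the helpers to_bits and has_flag are invented for this example.

```cpp
#include <cassert>

// Hypothetical flag set; each enumerator occupies one bit.
enum class Permission : unsigned
{
    None    = 0,
    Read    = 1u << 0,
    Write   = 1u << 1,
    Execute = 1u << 2
};

constexpr unsigned to_bits(Permission p) { return static_cast<unsigned>(p); }

// Non-member operators: the only option, since enums cannot inherit.
constexpr Permission operator|(Permission lhs, Permission rhs)
{
    return static_cast<Permission>(to_bits(lhs) | to_bits(rhs));
}
constexpr Permission operator&(Permission lhs, Permission rhs)
{
    return static_cast<Permission>(to_bits(lhs) & to_bits(rhs));
}
inline Permission& operator|=(Permission& lhs, Permission rhs)
{
    lhs = lhs | rhs;
    return lhs;
}

constexpr bool has_flag(Permission value, Permission flag)
{
    return (value & flag) == flag;
}

inline bool flags_demo()
{
    Permission p = Permission::Read;
    p |= Permission::Write;
    assert(has_flag(p, Permission::Read));
    return has_flag(p, Permission::Write) && !has_flag(p, Permission::Execute);
}
```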

Performance

As Jonathan already stated, NamedType is a zero-cost abstraction: given a sufficient level of optimisation (typically O1 or O2), compilers emit the same code as if arithmetic types were used directly. This also holds for using a struct as strong type. However, I wanted to test if compilers were also able to vectorize the code correctly when using NamedType or a struct instead of arithmetic types.

I compiled the following code on Visual Studio 2017 (version 15.5.7) with default release options in both 32-bit and 64-bit configurations. I used godbolt to test GCC 7.3 and Clang 5.0 in 64-bit, using the -O3 optimization flag.

    using NT_Int32 = fluent::NamedType<int32_t, struct Int32, fluent::Addable>;

    struct S_Int32 { int32_t v; };

    S_Int32 operator+(S_Int32 lhs, S_Int32 rhs)
    {
        return { lhs.v + rhs.v };
    }

    void vectorAddNT(NT_Int32* dst, const NT_Int32* src1, const NT_Int32* src2, int N)
    {
        for (int i = 0; i < N; ++i)
            dst[i] = src1[i] + src2[i];
    }

    void vectorAddS(S_Int32* dst, const S_Int32* src1, const S_Int32* src2, int N)
    {
        for (int i = 0; i < N; ++i)
            dst[i] = src1[i] + src2[i];
    }

    void vectorAddi32(int32_t* dst, const int32_t* src1, const int32_t* src2, int N)
    {
        for (int i = 0; i < N; ++i)
            dst[i] = src1[i] + src2[i];
    }

Under Clang and GCC, all is well: the generated code is the same for all three functions, and SSE2 instructions are used to load, add and store the integers.

Unfortunately, results under VS2017 are less than stellar. Whereas the generated code for arithmetic types and structs both use SSE2 instructions, NamedType seems to inhibit vectorization. The same behavior can be observed if get() is used directly instead of using the Addable skill. This is something to keep in mind when using NamedType with big arrays of data.

VS2017 also disappoints in an unexpected way. The size of NT_Int32 is 4 bytes on all platforms, with all compilers, as it should be. However, as soon as a second skill is added to the NamedType, for example Subtractable, the size of the type becomes 8 bytes! This is also true for other arithmetic types. Replacing int32_t in the NamedType alias with double yields a size of 8 bytes for one skill, but 16 bytes as soon as a second skill is added.

Is it a missing empty base class optimization in VS2017? Such a pessimization yields memory-inefficient, cache-unfriendly code. Let’s hope future versions of VS2017 will fare better.

EDIT: As redditor fernzeit pointed out, the empty base class optimization is disabled by default when using multiple inheritance on Visual Studio. When using the __declspec(empty_bases) attribute, Visual Studio generates the same class layout as Clang and GCC. The attribute has been added to the NamedType implementation to fix the issue.
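The effect can be reproduced without NamedType. The sketch below uses two hypothetical empty skill bases and a portability macro, UTIL_EMPTY_BASES, that is an assumption of this example; it expands to the attribute only on Visual Studio.

```cpp
#include <cstdint>

// Hypothetical portability macro: the attribute exists only on MSVC.
#if defined(_MSC_VER)
#define UTIL_EMPTY_BASES __declspec(empty_bases)
#else
#define UTIL_EMPTY_BASES
#endif

// Two hypothetical empty skill bases, standing in for Addable and Subtractable.
template <typename Derived> struct AddableSkill {};
template <typename Derived> struct SubtractableSkill {};

struct UTIL_EMPTY_BASES Int32WithTwoSkills
    : AddableSkill<Int32WithTwoSkills>
    , SubtractableSkill<Int32WithTwoSkills>
{
    std::int32_t v;
};

// With the empty base class optimization applied to both bases,
// the skills should not add any storage.
static_assert(sizeof(Int32WithTwoSkills) == sizeof(std::int32_t),
              "empty skill bases should not increase the size");
```

Removing the macro from the struct declaration is a quick way to observe the size difference on Visual Studio.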

Compilation time

A criticism often formulated against templates is that they tend to slow compilation down. Could it affect NamedType? On the other hand, since all the code for NamedType is considered external to a project, it can be added to a precompiled header, which means it will be read from disk and parsed only once.

Using a struct as strong type with include files for skills does not incur the template penalty, but requires reading from disk and parsing the skill files again and again. Precompiled headers cannot be used for the skill files, because they change each time they are included. However, the struct can be forward declared, a nice compilation firewall that NamedType cannot use, since type aliases cannot be forward declared.
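The forward declaration point can be sketched in a single listing, with comments marking where each part would live in a real project. The file names and the area function are hypothetical.

```cpp
#include <cassert>

// --- would live in a lightweight header, e.g. geometry_fwd.hxx (hypothetical) ---
// Client headers only need these declarations, not the full definitions.
struct Width;
struct Height;
double area(Width const& width, Height const& height);

// --- would live in the header defining the strong types ---
struct Width  { double v; };
struct Height { double v; };

// --- would live in the implementation file ---
double area(Width const& width, Height const& height)
{
    return width.v * height.v;
}

inline double forward_declaration_demo()
{
    assert(area(Width{1.0}, Height{1.0}) == 1.0);
    return area(Width{2.0}, Height{4.0});
}
```

Headers that merely mention Width in a function signature can include the lightweight declarations and skip the skill files entirely.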

To test compilation time, I created a project with 8 strong types, each contained in its own header file, and 8 simple algorithms, each using one strong type and having both a header file and an implementation file. A main file then includes all the algorithm headers, instantiates the strong types and calls the functions one at a time.

Compilation time has been measured in Visual Studio 2017 (version 15.5.7) using the very useful VSColorOutput extension (check it out!). Default compilation options for a Windows console application were used. For every configuration, 5 consecutive compilations have been performed and the median time computed. Consequently, these are not “cold” times: caching affects the results.

Two scenarios have been considered: the full rebuild, typical of build machines, and the single-file incremental build, typical of the inner development loop.

32-bit and 64-bit configurations yielded no significant difference in compilation time, so the average of the two is reported below. This is also the case for debug and release configurations (unless otherwise stated). All times are in seconds, with a variability of about ± 0.1s.

A first look at the results in Table 1 could lead to hasty conclusions. NamedType appears slower, but its compilation time can be greatly reduced with the use of precompiled headers. Also, the other strategies have an unfair advantage: they don’t include any standard headers. NamedType includes four of them: type_traits, functional, memory and iostream (mostly to implement the various skills). In most real-life projects, those headers would also be included, probably in precompiled headers to avoid slowing down compilation time.

Also it’s worth noting that NamedType currently brings in all skills in the same header. Presumably, including skill headers on demand could decrease compilation time in some applications.

To get a fairer picture, precompiled headers have been used to generate the results in Table 2 below:

Ah, much nicer! It is hazardous to extrapolate these results to bigger, real-life projects, but they are encouraging and support the idea that strong typing is a zero-cost abstraction, with negligible impact on compilation time.

Conclusion

My goal is not to convince you that using structs as strong types is better than using NamedType. Rather, strong typing is so useful that you should have alternatives if NamedType doesn’t suit you for some reason, while we wait for an opaque typedef to be part of the C++ standard.

One alternative that is easy to adopt is to use structs as strong types. It offers most of NamedType functionality and type safety, while being easier to understand for novice C++ programmers — and some compilers.

If you have questions or comments, I would enjoy reading them! Post them below, or contact me on Twitter.
