Lightweight but still STL-compatible unique pointer

Magnum got a new unique pointer implementation that’s much more lightweight, with better debug performance and compile times, but is still fully compatible with std::unique_ptr.

Content care: Jan 28, 2019. Links to single-header libs mentioned at the end of the article were updated to match the current state.

Magnum is currently undergoing an optimization pass for shorter compilation times and smaller binary sizes, further improving on what was done back in 2013. Back then I managed to reduce the number of template instantiations and remove the then-heaviest #includes such as <iostream> or <algorithm> from headers, while also banning several others such as <regex> or <random> from ever leaking there. With 2013 hardware and GCC 4.8, that brought compile times down from 5:00 to 2:59, which was already a significant improvement.

Nowadays Magnum compiles with all tests in 80 seconds. The codebase got considerably bigger during the past five years, but Moore’s law is also still in effect, so one could say that the best solution for improving compile times is to just wait a bit. (Similarly, the best way to cure insomnia is to get more sleep.) But, especially after seeing what’s possible with plain C, I’m not completely happy with the current state and I think I could do better.

~ ~ ~

The problem with removing the most significant causes of slowdown is that something else steps up to become the most problematic. Now things like <vector> or <string> are among the top offenders and, apart from replacing std::vector with Magnum’s lightweight Containers::Array where possible, the most efficient cure is to PIMPL the class internals to remove them from class definitions. That works for almost everything. Except for std::unique_ptr, because that one is often used to wrap the PIMPL itself, since you definitely do not want to implement copy/move constructors for each PIMPL’d class by hand instead.
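To make the constraint concrete, here is a minimal sketch of the PIMPL idiom with std::unique_ptr holding the internals. The Mesh class, its State struct and the split into Mesh.h and Mesh.cpp are made up for illustration and are not Magnum code. The point is only that <string> and <vector> can stay in the source file, while <memory> still has to appear in the public header.

```cpp
// ---- Mesh.h (hypothetical): the only standard include needed is <memory> ----
#include <memory>

class Mesh {
    public:
        explicit Mesh(const char* name);
        ~Mesh();                            // defined where State is complete
        Mesh(Mesh&&) noexcept;
        Mesh& operator=(Mesh&&) noexcept;

        void addVertex(float x, float y, float z);
        int vertexCount() const;
        const char* name() const;

    private:
        struct State;                       // internals are only forward-declared
        std::unique_ptr<State> _state;      // move-only owner of the internals
};

// ---- Mesh.cpp (hypothetical): heavy headers stay out of the public header ----
#include <cassert>
#include <string>
#include <vector>

struct Mesh::State {
    std::string name;
    std::vector<float> positions;
};

Mesh::Mesh(const char* name): _state{new State{name, {}}} {}

// Defaulted here, where State is a complete type, so unique_ptr can delete it
Mesh::~Mesh() = default;
Mesh::Mesh(Mesh&&) noexcept = default;
Mesh& Mesh::operator=(Mesh&&) noexcept = default;

void Mesh::addVertex(float x, float y, float z) {
    assert(_state && "Mesh was moved from");
    _state->positions.insert(_state->positions.end(), {x, y, z});
}

// A moved-from Mesh has no state; report it as empty instead of crashing
int Mesh::vertexCount() const {
    return _state ? int(_state->positions.size()/3) : 0;
}

const char* Mesh::name() const {
    return _state ? _state->name.c_str() : "";
}
```

Code that includes only the header part can create a Mesh, add vertices and move it around without ever seeing <string> or <vector>. The move operations come for free from std::unique_ptr, which is exactly why it is hard to get rid of.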

To my great surprise, the <memory> header is quite a beast, twice as big as <vector> (which, well, has to handle all the complex move-aware reallocations), and it only gets worse with newer C++ standards. It’s actually even slightly bigger than <iostream>, which I banned for this very reason!

Below is a graph of preprocessed line count for each header, generated using the following command with GCC 8.2. Note the use of -P, which removes unnecessary #line statements from the preprocessor output, so the resulting line count corresponds more closely to the amount of actual code. The last line in the plot, for comparison, is using Clang 7.0 with libc++. While preprocessed line count is not the only factor affecting compile times, it correlates with them pretty well, especially in template-heavy C++ code.

    echo "#include <memory>" | gcc -std=c++11 -P -E -x c++ - | wc -l

Preprocessed line count:

    <vector>                         8608 lines
    <vector> + <string>             14652 lines
    <iostream>                      17839 lines
    <memory>                        17863 lines
    <memory> (libstdc++, C++17)     20995 lines
    <memory> (libc++, C++2a)        16736 lines

Let’s step back a bit and try again

Imposing the burden of 17k lines on every user of the class would absolutely destroy any benefits of PIMPLing away the <vector> and <string> includes, as the <memory> header alone is bigger than those two combined. The crazy part is that it’s just a move-only wrapper over a pointer.

The new Containers::Pointer is also just that, but in a reasonably-sized package. Unlike std::unique_ptr it doesn’t support arrays (Magnum has Containers::Array for that) and at the moment it doesn’t have custom deleters, as there was no immediate need for this feature. On the other hand, it provides an equivalent to std::make_unique() without forcing you to use C++14. It’s named just Pointer, because I already have an Array and I don’t ever plan on implementing an alternative to std::shared_ptr, because, in my opinion, the only purpose of that type is making coding crimes easier to commit.

Let’s look at it again:

Preprocessed line count:

    <Containers/Pointer.h>, C++11     2311 lines
    <Containers/Pointer.h>, C++2a     2769 lines
    <memory>, C++11                  17863 lines
    <memory>, C++2a                  21014 lines

It could be smaller, but I needed <type_traits> to do some convenience compile-time checks (one of them is forbidding its use on T[]). And for in-place construction using Containers::pointer(), I needed std::forward() from <utility>. I could have used static_cast instead and saved myself ~700 lines of code, but the header is so essential that you’ll be including it yourself sooner or later anyway.
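As a rough illustration of how little a move-only owning pointer needs, the sketch below builds one on top of just <type_traits> and <utility> (plus <cassert> for the checks). TinyPointer and tinyPointer() are hypothetical names invented for this example; this is not the actual Containers::Pointer source and it leaves out everything the real class adds on top.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical illustration only, not the real Containers::Pointer
template<class T> class TinyPointer {
    static_assert(!std::is_array<T>::value,
        "TinyPointer does not support arrays");

    public:
        TinyPointer() noexcept: _pointer{nullptr} {}
        explicit TinyPointer(T* pointer) noexcept: _pointer{pointer} {}

        // Move-only: copying is disabled, moving transfers ownership
        TinyPointer(const TinyPointer&) = delete;
        TinyPointer& operator=(const TinyPointer&) = delete;

        TinyPointer(TinyPointer&& other) noexcept: _pointer{other._pointer} {
            other._pointer = nullptr;
        }
        TinyPointer& operator=(TinyPointer&& other) noexcept {
            // The previous pointee ends up in `other` and is deleted with it
            std::swap(_pointer, other._pointer);
            return *this;
        }

        ~TinyPointer() { delete _pointer; }

        explicit operator bool() const noexcept { return _pointer != nullptr; }
        T* get() const noexcept { return _pointer; }

        T& operator*() const {
            assert(_pointer);
            return *_pointer;
        }
        T* operator->() const {
            assert(_pointer);
            return _pointer;
        }

        // Gives up ownership without deleting the pointee
        T* release() noexcept {
            T* out = _pointer;
            _pointer = nullptr;
            return out;
        }

    private:
        T* _pointer;
};

// In-place construction in the spirit of std::make_unique(), usable in C++11
template<class T, class ...Args> TinyPointer<T> tinyPointer(Args&&... args) {
    return TinyPointer<T>{new T{std::forward<Args>(args)...}};
}
```

With this, `TinyPointer<int> a = tinyPointer<int>(1337);` owns a heap-allocated int, and `TinyPointer<int> b = std::move(a);` leaves `a` empty. The static_assert is the kind of compile-time check that pulls in <type_traits>, and std::forward() is what pulls in <utility>.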

Compile times and debug performance

For a “microbenchmark” of compile times, I created the following two code snippets and compiled each with GCC 8.2. For a better sense of scale, there’s also a baseline time, which is from compiling just int main() {} with no #include at all.

Using Containers::Pointer:

    #include <Corrade/Containers/Pointer.h>

    using namespace Corrade;

    int main() {
        Containers::Pointer<int> a{new int{}};
        return *a;
    }

Using std::unique_ptr:

    #include <memory>

    int main() {
        std::unique_ptr<int> a{new int{}};
        return *a;
    }

By default, Containers::Pointer has a convenience printer for Utility::Debug and also provides human-readable assertions using the same utility. To make the comparison more balanced, I opted out of debug printing and switched to standard C assert() by defining CORRADE_NO_DEBUG and CORRADE_STANDARD_ASSERT on the compiler command line. The resulting times are below:

    g++ main.cpp -DCORRADE_NO_DEBUG -DCORRADE_STANDARD_ASSERT -std=c++11 # or c++2a

Compilation time, GCC 8.2:

    baseline (int main() {})          49.97 ± 0.54 ms
    <Containers/Pointer.h>, C++11     69.74 ± 3.04 ms
    <Containers/Pointer.h>, C++2a     71.41 ± 0.84 ms
    <memory>, C++11                  205.19 ± 1.05 ms
    <memory>, C++2a                  249.01 ± 4.72 ms

Regarding debug performance, checking on Compiler Explorer, std::unique_ptr resulted in roughly four times as many instructions as Containers::Pointer in a non-optimized build on both Clang and GCC. GCC with -O1 and higher was able to reduce the above snippet to a pair of new and delete, Clang with -O1 shortened the code to roughly half for both (but still with a 3x difference), and Clang with -O2 and up managed to get rid of the allocation altogether in both cases, which is nice.

What if my library already uses std::unique_ptr?

Magnum will be gradually switching to the new type in all APIs, but because I don’t want to make your life miserable, the type is able to implicitly morph from and back into std::unique_ptr. A similar trick is already used in the Magnum Math library, for example for the GLM math library integration.

The conversion is provided in a separate Corrade/Containers/PointerStl.h header because, well, doing it directly in the class itself would require me to #include <memory>, which I wanted to avoid in the first place. As a side effect, it also allows you to have an equivalent of std::make_unique() in C++11, namely Containers::pointer():

    #include <Corrade/Containers/PointerStl.h>

    using namespace Corrade;

    int main() {
        std::unique_ptr<int> a{new int{42}};
        Containers::Pointer<int> b = std::move(a);

        std::unique_ptr<int> c = Containers::pointer<int>(1337);
    }

This conversion behaves like any other usual move: the original instance gets release()d, becoming nullptr, and the ownership moves to the other.
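One way such an opt-in conversion can be wired up without the lightweight header ever seeing <memory> is a converter template that is only declared next to the class and specialized in a separate header. The sketch below shows that general technique under invented names (Handle, HandleConverter); it is an assumption about how a design like this can look, not a copy of what PointerStl.h does.

```cpp
#include <cassert>
#include <utility>   // std::swap, std::move, std::declval

// ---- Lightweight header (hypothetical), no <memory> in sight ----

// Only declared here; opt-in headers specialize it for external types
template<class T, class External> struct HandleConverter;

template<class T> class Handle {
    public:
        explicit Handle(T* pointer = nullptr) noexcept: _pointer{pointer} {}
        Handle(const Handle&) = delete;
        Handle(Handle&& other) noexcept: _pointer{other.release()} {}
        Handle& operator=(const Handle&) = delete;
        Handle& operator=(Handle&& other) noexcept {
            std::swap(_pointer, other._pointer);
            return *this;
        }
        ~Handle() { delete _pointer; }

        // Implicit construction from any rvalue type that has a converter.
        // Without a matching specialization the decltype fails to
        // substitute and this overload silently drops out.
        template<class External, class = decltype(
            HandleConverter<T, External>::from(std::declval<External&&>()))>
        Handle(External&& other):
            Handle{HandleConverter<T, External>::from(std::move(other))} {}

        // Implicit conversion of an rvalue Handle to such a type
        template<class External, class = decltype(
            HandleConverter<T, External>::to(std::declval<Handle<T>&&>()))>
        operator External() && {
            return HandleConverter<T, External>::to(std::move(*this));
        }

        T* get() const noexcept { return _pointer; }
        T* release() noexcept {
            T* out = _pointer;
            _pointer = nullptr;
            return out;
        }

    private:
        T* _pointer;
};

// ---- Separate opt-in header (hypothetical): the only place with <memory> ----
#include <memory>

template<class T> struct HandleConverter<T, std::unique_ptr<T>> {
    static Handle<T> from(std::unique_ptr<T>&& other) {
        return Handle<T>{other.release()};
    }
    static std::unique_ptr<T> to(Handle<T>&& other) {
        return std::unique_ptr<T>{other.release()};
    }
};
```

Code that includes only the first part pays nothing for the STL interoperability. Code that also includes the second part can write `Handle<int> b = std::move(a);` for a `std::unique_ptr<int> a` and `std::unique_ptr<int> c = std::move(b);` for the way back, with ownership transferred through release() in both directions.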

The case of std::reference_wrapper

I… I’m not even mad anymore. Just disappointed.

The main use of this standard type in Magnum APIs is to allow storing references (or non-nullable pointers) in various containers. The std::reference_wrapper is even simpler than std::unique_ptr, yet it’s shoveled into the <functional> header, which was not exactly slim to begin with and managed to gain an insane amount of weight due to (I assume) the introduction of searchers in C++17. Like, why not put these in <search> instead?! So I made my own Containers::Reference, too (and it’s also convertible to/from the STL equivalent in a similar way).

Preprocessed line count:

    <Containers/Reference.h>, C++11     1646 lines
    <Containers/Reference.h>, C++2a     2015 lines
    <functional>, C++11                14540 lines
    <functional>, C++2a                31353 lines

In this case I didn’t even need <utility>, so the header is just 1646 preprocessed lines under C++11. To wrap it up, here are compile times of the following snippets, again with the baseline comparison for better scale:

Using Containers::Reference:

    #include <Corrade/Containers/Reference.h>

    using namespace Corrade;

    int main() {
        int a{};
        Containers::Reference<int> b = a;
        return b;
    }

Using std::reference_wrapper:

    #include <functional>

    int main() {
        int a{};
        std::reference_wrapper<int> b = a;
        return b;
    }

Compilation time, GCC 8.2:

    baseline (int main() {})            49.97 ± 0.54 ms
    <Containers/Reference.h>, C++11     64.66 ± 3.49 ms
    <Containers/Reference.h>, C++2a     66.29 ± 4.87 ms
    <functional>, C++11                 173.6 ± 7.38 ms
    <functional>, C++2a                 308.8 ± 7.76 ms
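For a sense of how small a rebindable, non-nullable reference type can be, here is a minimal sketch that needs no standard library header for the class itself. TinyReference and tinyReferenceDemo() are hypothetical names used only for this example, and the class is not the actual Containers::Reference implementation.

```cpp
#include <cassert>   // only for the checks in the demo function

// Hypothetical illustration only, not the real Containers::Reference
template<class T> class TinyReference {
    public:
        // Implicit construction from an lvalue. Plain & is used to avoid
        // std::addressof() from <memory>, so types overloading operator&
        // are not handled by this sketch.
        TinyReference(T& reference) noexcept: _pointer{&reference} {}

        // Refuse temporaries, the wrapper would dangle immediately
        TinyReference(T&&) = delete;

        // Implicit conversion back to a plain reference
        operator T&() const noexcept { return *_pointer; }
        T& get() const noexcept { return *_pointer; }
        T* operator->() const noexcept { return _pointer; }

    private:
        T* _pointer;     // never null, always set from a reference
};

// Usage: references stored in an array, which plain T& does not allow
inline void tinyReferenceDemo() {
    int a = 1, b = 2;
    TinyReference<int> refs[]{a, b};

    refs[0].get() = 10;         // writes through to a
    int& second = refs[1];      // implicit conversion to int&
    second = 20;                // writes through to b
    assert(a == 10 && b == 20);

    refs[0] = refs[1];          // rebinds refs[0] to b, a is untouched
    assert(&refs[0].get() == &b);
    assert(a == 10);
}
```

Copy assignment rebinds the wrapper instead of assigning through it, which is the same behavior std::reference_wrapper has and what makes such a type storable in containers.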

But, but, … modules?

The Modules work has been running for half a decade already and many of the header bloat concerns are being handwaved away with “modules will solve that”. I looked at the proposals back in 2016, but didn’t have a chance to check back since, so I was excited to see the progress.

TL;DR: no, we’re not there yet.

While Modules are said to be on track for C++20 (I hope that’s still possible), I was not able to find any real-world example that would work for me. After much struggling, I managed to come up with this command line:

    clang++ -std=c++17 -stdlib=libc++ -fmodules-ts -fimplicit-modules \
        -fmodule-map-file=/usr/include/c++/v1/module.modulemap main.cpp

And, after installing both libc++ and libc++-experimental from AUR, the following snippet compiled correctly. Various examples told me that I could import std.memory;, but that only greeted me with an ungoogleable error.

    import std;

    int main() {
        std::unique_ptr<int> a{new int{}};
        return *a;
    }

The measured compile times are below, but note the very first run takes almost two seconds: it’s compiling the module file, resulting in 17 megabytes of various binaries in your temp directory. And you get a different set of these for different flags; enabling -O3 generates another set of binaries. That … feels pretty much like precompiled headers. Not sure if happy. (I didn’t like those at all.)

Compilation time, Clang 7.0 -std=c++17:

    baseline (int main() {})                         82.93 ± 0.78 ms
    Containers::Pointer (<Containers/Pointer.h>)    108.01 ± 4.51 ms
    std::unique_ptr (<memory>)                      279.79 ± 4.15 ms
    std::unique_ptr (import std)                     90.86 ± 4.14 ms

I was looking forward to C++ modules to simplify library linking to the point where you just say “this is the library I want to link to” on the command line and it will feed both the linker with correct object code and the compiler with correct imported definitions.

Wishful thinking. This is nowhere near that and the speed gains are not that significant compared to responsible header hygiene. People with bigger codebases are reporting even smaller gains, around 10%, which makes me wonder if this is worth bothering with in the current state of things. And using modules will not magically improve debug performance of STL containers anyway. What’s worse is that the implementation is not properly documented anywhere (the Clang Modules documentation is not about the Modules TS, but about their own, different thing) and there’s no support in tools or IDEs (not to mention build systems), so at the moment it’s very painful to work with. I think I’ll check back in another five years.