Modern C++ Won't Save Us

I’m a frequent critic of memory-unsafe languages, principally C and C++, and how they induce an exceptional number of security vulnerabilities. My conclusion, based on reviewing evidence from numerous large software projects using C and C++, is that we need to be migrating our industry to memory-safe-by-default languages (such as Rust and Swift). One of the responses I frequently receive is that the problem isn’t C and C++ themselves; developers are simply holding them wrong. In particular, I often receive defenses of C++ of the form, “C++ is safe if you don’t use any of the functionality inherited from C,” or similarly, that if you use modern C++ types and idioms you will be immune from the memory corruption vulnerabilities that plague other projects.

I would like to credit C++'s smart pointer types, because they do significantly help. Unfortunately, my experience working on large C++ projects which use modern idioms is that these are not nearly sufficient to stop the flood of vulnerabilities. My goal for the remainder of this post is to highlight a number of completely modern C++ idioms which produce vulnerabilities.

Hide the reference use-after-free

The first example I’d like to describe, originally from Kostya Serebryany, is how C++'s std::string_view can make it easy to hide use-after-free vulnerabilities:

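    #include <iostream>
    #include <string>
    #include <string_view>

    int main() {
        std::string s = "Hellooooooooooooooo ";
        // BUG (intentional): the temporary std::string produced by operator+
        // is destroyed at the end of this statement, so sv dangles.
        std::string_view sv = s + "World\n";
        std::cout << sv;  // use-after-free
    }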

What’s happening here is that s + "World\n" allocates a new std::string, which is then converted to a std::string_view. At this point the temporary std::string is freed, but sv still points at the memory that used to be owned by it. Any future use of sv is a use-after-free vulnerability. Oops! C++ lacks the facilities for the compiler to be aware that sv captures a reference to something where the reference lives longer than the referent. The same issue impacts std::span, also an extremely modern C++ type.

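For contrast, here is a minimal sketch of one way to avoid the dangling view: keep the concatenated result in an owning std::string, and only create views of objects that are known to outlive them. The helper names make_greeting and count_o are hypothetical and exist only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: returns an owning std::string, so the caller
// decides how long the character data lives.
std::string make_greeting(const std::string& s) {
    return s + "World\n";
}

// Hypothetical helper: taking a string_view as a parameter is fine,
// because the argument outlives the call.
std::size_t count_o(std::string_view sv) {
    std::size_t n = 0;
    for (char c : sv) {
        if (c == 'o') ++n;
    }
    return n;
}
```

A caller would write std::string owned = make_greeting(s); and only then std::string_view sv = owned;. Nothing in the language enforces this ordering, which is the point of the example above: the safe and unsafe versions look almost identical.
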
Another fun variant involves using C++'s lambda support to hide a reference:

    #include <functional>
    #include <iostream>
    #include <memory>

    std::function<int(void)> f(std::shared_ptr<int> x) {
        // BUG (intentional): [&] captures the parameter x by reference.
        return [&]() { return *x; };
    }

    int main() {
        std::function<int(void)> y(nullptr);
        {
            std::shared_ptr<int> x(std::make_shared<int>(4));
            y = f(x);
        }  // the last owner of the int is destroyed here
        std::cout << y() << std::endl;  // use-after-free
    }

Here the [&] in f causes the lambda to capture x by reference, so the closure holds only a reference to f's by-value parameter x and never takes its own share of ownership. That reference is already dangling once the call to f completes; then in main, x goes out of scope, destroying the last reference to the data and causing it to be freed. At this point y contains a dangling pointer, and calling it is a use-after-free. This occurs despite our meticulous use of smart pointers throughout. And yes, people really do write code that handles std::shared_ptr<T>&, often as an attempt to avoid additional increments and decrements on the reference count.
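
As a sketch of the non-dangling alternative, the lambda can capture the std::shared_ptr by value, so the closure itself co-owns the int. The name f_by_value is hypothetical; it mirrors f from the example above with only the capture list changed.

```cpp
#include <functional>
#include <memory>

// Hypothetical counterpart to f: identical except for the capture list.
std::function<int()> f_by_value(std::shared_ptr<int> x) {
    // [x] copies the shared_ptr into the closure, bumping the reference
    // count, so the int stays alive for as long as the closure does.
    return [x]() { return *x; };
}
```

The difference between the two versions is a single character in the capture list, and the compiler accepts both without complaint.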

std::optional<T> dereference

std::optional represents a value that may or may not be present, often replacing magic sentinel values (such as -1 or nullptr). It offers methods such as value(), which extracts the T it contains and throws an exception if the optional is empty. However, it also defines operator* and operator->. These methods also provide access to the underlying T, but they do not check whether the optional actually contains a value.

The following code for example, simply returns an uninitialized value:

    #include <optional>

    int f() {
        std::optional<int> x(std::nullopt);
        return *x;  // unchecked: x is empty, so this reads an uninitialized value
    }

If you use std::optional as a replacement for nullptr this can produce even more serious issues! Dereferencing a nullptr gives a segfault (which is not a security issue, except in older kernels). Dereferencing a nullopt, however, gives you an uninitialized value as a pointer, which can be a serious security issue. While having a T* with an uninitialized value is also possible, these are much less common than dereferencing a pointer that was correctly initialized to nullptr.

And no, this doesn’t require you to be using raw pointers. You can get uninitialized/wild pointers with smart pointers as well:

    #include <memory>
    #include <optional>

    std::unique_ptr<int> f() {
        std::optional<std::unique_ptr<int>> x(std::nullopt);
        return std::move(*x);  // unchecked: moves from a unique_ptr that was never constructed
    }
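
For comparison, here is a small sketch of the checked accessors that std::optional does provide: value(), which throws std::bad_optional_access when empty, and value_or(), which substitutes a fallback. The functions find_port, port_or_default, and port_or_throw, and the port numbers, are hypothetical and only serve the illustration.

```cpp
#include <optional>

// Hypothetical lookup that may or may not produce a value.
std::optional<int> find_port(bool configured) {
    if (configured) return 8080;
    return std::nullopt;
}

// value_or() returns the fallback when the optional is empty.
int port_or_default(const std::optional<int>& port) {
    return port.value_or(80);
}

// value() throws std::bad_optional_access when the optional is empty,
// instead of reading uninitialized storage like operator* does.
int port_or_throw(const std::optional<int>& port) {
    return port.value();
}
```

These accessors are opt-in. The shorter spelling, *port, remains the unchecked one, so nothing steers a developer toward the safe choice.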

std::span<T> indexing

std::span<T> provides an ergonomic way to pass around a reference to a contiguous slice of memory and a length. This lets you easily write code that works over multiple different types; a std::span<uint8_t> can point to memory owned by a std::vector<uint8_t>, a std::array<uint8_t, N>, or even a raw pointer. Failure to correctly check bounds is a frequent source of security vulnerabilities, and in many senses span helps out with this by ensuring you always have a length handy.

Like all STL data structures, span's operator[] method does not perform any bounds checks. This is regrettable, since operator[] is the most ergonomic and default way people use data structures. std::vector and std::array can at least theoretically be used safely because they offer an at() method which is bounds checked (in practice I’ve never seen this done, but you could imagine a project adopting a static analysis tool which simply banned calls to std::vector<T>::operator[]). As specified in C++20, span does not offer an at() method, or any other method which performs a bounds-checked lookup.
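
To illustrate what a bounds-checked lookup could look like, here is a minimal sketch of a free function that checks the index before calling operator[]. The name checked_at is hypothetical and not part of the standard library. It is written against any type that works with std::size and operator[], which includes std::span, std::vector, std::array, and built-in arrays, so the sketch itself only needs C++17.

```cpp
#include <cstddef>
#include <iterator>
#include <stdexcept>

// Hypothetical helper, not a standard facility: bounds-checked indexing
// for any container that supports std::size(c) and c[i].
template <typename Container>
decltype(auto) checked_at(Container& c, std::size_t i) {
    if (i >= std::size(c)) {
        throw std::out_of_range("checked_at: index out of range");
    }
    return c[i];  // returns whatever operator[] returns, typically a reference
}
```

A helper like this only protects the call sites that remember to use it; every remaining use of operator[] is still unchecked.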

Interestingly, both Firefox and Chromium’s backports of std::span do perform bounds checks in operator[], and thus they’ll never be able to safely migrate to std::span.

Conclusion

Modern C++ idioms introduce many changes which have the potential to improve security: smart pointers better express expected lifetimes, std::span ensures you always have a correct length handy, std::variant provides a safer abstraction for unions. However, modern C++ also introduces some incredible new sources of vulnerabilities: lambda capture use-after-free, uninitialized-value optionals, and un-bounds-checked spans.

My professional experience writing relatively modern C++, and auditing Rust code (including Rust code that makes significant use of unsafe), is that the safety of modern C++ is simply no match for memory-safe-by-default languages like Rust and Swift (or Python and JavaScript, though I find it rare in life to have a program that makes sense to write in either Python or C++).

There are significant challenges to migrating existing, large, C and C++ codebases to a different language – no one can deny this. Nonetheless, the question simply must be how we can accomplish it, rather than if we should try. Even with the most modern C++ idioms available, the evidence is clear that, at scale, it’s simply not possible to hold C++ right.