Being smart about ownership

Understanding smart pointers using a better framework

You can read this article on my own blog, if you prefer that to Medium.

In 2011 the C++ standard introduced unique_ptr, shared_ptr, weak_ptr and shared_ptr’s atomic counterpart. All the major distributors of the standard library immediately and flawlessly implemented all six of these pointers.

By early 2012, every C++ developer had read up on these new smart pointers and understood their importance and their use cases. This was swiftly followed by a massive refactoring of most open source C++ libraries and projects to eliminate the use of raw pointers. Private companies, personal codebases and teaching material soon followed suit.

In 2018, someone learning C++ would hardly know what a raw pointer is until they have been working with the language for months. Raw pointers are still used in various edge cases, such as some high-performance data structures and various compatibility layers. However, most applications have switched to using only smart pointers, and to using them correctly at that, making our code safer and easier to read, modify and reason about.

So what went wrong?

Well, a lot of things went wrong, and I don’t have time to go through them all here. However, I put it to you, dear reader, that one of the reasons why smart pointers are underused, and shared_ptr is often abused, is that we misunderstood the main concept behind them.

Smart pointers are about ownership

A lot of people seem to think of unique pointers as a mechanism to defer freeing a raw pointer. Even more people view shared_ptr as C++’s “garbage collected” pointer, to be used and abused the same way one would treat a JVM object reference. Smart pointers do indeed free a resource when it’s no longer needed. However, the important part of that sentence is not the “free” part but the “when it’s no longer needed” part.

Smart pointers represent a way to express ownership over a resource. They are a tool that helps us and future developers understand how we are using a resource. By doing that, they also happen to help us avoid use-after-free and memory leaks, but those things are a side effect of their real purpose.

Unique pointer

The std::unique_ptr has practically no overhead and very predictable behavior. The underlying resource is destroyed when the pointer exits its scope; the pointer can be moved but it can’t be copied. It’s intended to represent a resource that is owned by a single subroutine but needs to be dynamically allocated. Use cases include moving around an expensive-to-copy resource that has no move constructor, passing a resource between threads, various implementations of singletons or pimpl, and compatibility with C libraries which use pointers.

This pointer is particularly nice because you can apply the same reasoning to it as you would apply to a value declared on the stack. It goes a long way towards preventing use-after-free errors and accidental sharing with other threads.
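To make the "moved but not copied" rule concrete, here is a minimal sketch. The `Buffer` type and the `consume` and `demo_unique` functions are hypothetical names invented for illustration, and `std::make_unique` assumes C++14 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical resource type, used only for illustration.
struct Buffer {
    std::string data;
};

// unique_ptr is move-only: copying is rejected at compile time.
static_assert(!std::is_copy_constructible<std::unique_ptr<Buffer>>::value,
              "unique_ptr cannot be copied");
static_assert(std::is_move_constructible<std::unique_ptr<Buffer>>::value,
              "unique_ptr can be moved");

// Taking the pointer by value means the callee becomes the owner.
std::size_t consume(std::unique_ptr<Buffer> buf) {
    return buf->data.size();
}   // buf goes out of scope here and the Buffer is destroyed

std::size_t demo_unique() {
    auto buf = std::make_unique<Buffer>();
    buf->data = "hello";
    // auto copy = buf;                       // would not compile: no copy constructor
    std::size_t n = consume(std::move(buf));  // ownership handed over explicitly
    assert(buf == nullptr);                   // the moved-from pointer is empty
    return n;
}
```

The `std::move` at the call site is the point: a reader can see exactly where ownership changes hands, much as they could with a stack value.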

Shared pointer

This one is a bit trickier. It’s expensive (every copy updates an atomic reference count) and it’s dangerous to use, since it expresses ownership in a much “looser” way. It’s meant mainly for multi-threaded resource sharing and gives you the guarantee that the underlying resource hasn’t been freed by another thread. This pointer is especially amazing if you have a function or method which must return a shared resource, since the caller is immediately made aware of that. It can also be used to implement a lot of higher-level state-sharing mechanisms.

Shared pointer’s biggest problem is that it often gets used instead of a unique pointer or a weak pointer, leading to confusing code and poor performance.
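The "looser" ownership can be seen in a small single-threaded sketch. The `Connection` type and `demo_shared` function are hypothetical, invented here for illustration: the resource is destroyed only when the last owner lets go, wherever that happens to be.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource that records its own destruction in a flag.
struct Connection {
    explicit Connection(bool& destroyed) : destroyed_(destroyed) {}
    ~Connection() { destroyed_ = true; }
    bool& destroyed_;
};

bool demo_shared() {
    bool destroyed = false;
    auto first = std::make_shared<Connection>(destroyed);
    {
        std::shared_ptr<Connection> second = first;  // second owner; count is 2
        assert(first.use_count() == 2);
        first.reset();       // the first owner lets go...
        assert(!destroyed);  // ...but the resource is still alive
    }                        // the last owner leaves scope here
    return destroyed;        // true: destroyed together with the last owner
}
```

With a unique_ptr the point of destruction is visible in one scope; here it depends on every place a copy was stored, which is why reaching for shared_ptr by default tends to make code harder to follow.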

Weak pointer

Finally we get to std::weak_ptr, probably the most underused of the three. This pointer should be used for referencing a shared resource which has a lifespan that is not controlled by the pointer’s owner. It can be quite useful when an external entity wishes to expose an object without giving you full ownership of said resource.

The obvious problem with this pointer is that people might be tempted to immediately turn it into a shared pointer and keep using that, instead of releasing ownership after each use. Conveniently, the most obvious usage pattern of this pointer will stop people from doing just that:
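A minimal sketch of that pattern, assuming a hypothetical `Sensor` type and `read_sensor` function (neither comes from a real library): the shared_ptr obtained from `lock()` lives only inside the `if` block, so ownership is given back as soon as the block ends.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource type, used only for illustration.
struct Sensor {
    int value = 42;
};

// Returns the sensor value, or -1 if the sensor no longer exists.
int read_sensor(const std::weak_ptr<Sensor>& weak) {
    // lock() yields a shared_ptr that is empty if the resource is gone.
    if (std::shared_ptr<Sensor> sensor = weak.lock()) {
        return sensor->value;  // safe: we share ownership inside this block
    }                          // temporary ownership is released here
    return -1;
}
```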

In essence, this pointer is the closest you will get to a raw pointer. However, it provides a smart safety mechanism: it forces a nullptr check combined with temporary shared ownership whenever you wish to use the resource it holds, which prevents you from unknowingly using a dangling pointer.

Finally, some code

One thing that I find surprising is how few code examples comparing usage of raw pointers to usage of smart pointers there are out there. So here’s an example to illustrate how all three ownership semantics help when compared to a raw pointer approach:
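A minimal sketch of the raw-pointer starting point. The names `Widget` and `get_widget` follow the text, but the bodies are invented for illustration, as is the assumption that the library keeps ownership.

```cpp
#include <cassert>

// Hypothetical library type; the body is invented for illustration.
struct Widget {
    int id = 7;
};

// The signature says nothing about who owns the returned Widget.
Widget* get_widget() {
    static Widget instance;  // assumption: here the library keeps ownership
    return &instance;
}

int use_widget() {
    Widget* w = get_widget();
    // Questions the type cannot answer:
    //  - Must we delete w when we are done? (Here that would be undefined behavior.)
    //  - Can the library destroy the Widget while we still hold w?
    //  - Do other callers, or other threads, see the same Widget?
    return w->id;
}
```

Nothing in `Widget*` distinguishes this implementation from one that allocates with `new` and expects the caller to `delete`, so the caller has to look elsewhere for the rules.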

So the obvious answer here is that we should read up on the documentation of this Widget library and understand how to use get_widget. However, documentation can be wrong; it doesn’t have a compiler to proofread it. Documentation can also be missing or outdated, or we may simply misinterpret it due to a variety of factors.

What if there was a way for the writer of the library to communicate to us the intended usage patterns for this widget via something we (as C++ developers) are bound to read and understand? The interface of the library itself. Let’s see how that would go:

Exhibit A

Let’s assume that once we request a widget it’s ours to use and abuse. The library gives us a pointer rather than the object itself because of reasons other than ownership.

In that case, we’d just use a unique_ptr.
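As a sketch of what such an interface could look like (the `Widget` body and both function bodies are hypothetical, and `std::make_unique` assumes C++14 or later):

```cpp
#include <cassert>
#include <memory>

// Hypothetical library type; the body is invented for illustration.
struct Widget {
    int id = 0;
};

// Returning unique_ptr tells the caller: this Widget is yours alone.
std::unique_ptr<Widget> get_widget() {
    return std::make_unique<Widget>();
}

int use_widget() {
    std::unique_ptr<Widget> w = get_widget();
    w->id = 1;    // nobody else can observe or free this object
    return w->id;
}                 // w is destroyed here: no delete to remember, no leak
```

The return type alone answers the ownership question: the caller must not share the widget by accident, and it is cleaned up when the caller is done with it.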

Exhibit B

But what if the author of the library has multiple users in mind? Maybe widgets are really expensive to construct, or maybe they are a direct interface to a singular hardware device that must be shared?
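In that case the interface could return a shared_ptr. A minimal sketch, assuming a single lazily created instance stands in for the shared device (the `Widget` body and both functions are hypothetical):

```cpp
#include <cassert>
#include <memory>

// Hypothetical library type; the body is invented for illustration.
struct Widget {
    int reads = 0;
};

// Returning shared_ptr tells the caller: others may hold this Widget too.
std::shared_ptr<Widget> get_widget() {
    // Assumption: one lazily created instance stands in for the single device.
    static std::shared_ptr<Widget> device = std::make_shared<Widget>();
    return device;
}

bool users_share_one_widget() {
    std::shared_ptr<Widget> a = get_widget();
    std::shared_ptr<Widget> b = get_widget();
    a->reads = 5;                      // a change made through one handle...
    return a == b && b->reads == 5;    // ...is visible through the other
}
```

The reference count keeps the widget alive for as long as anyone holds it, but it does not synchronize access to the `Widget` itself; if several threads use it, that still needs its own protection.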

Exhibit C

What if I don’t want the widget user to really “own” the widget at all? My library should decide when and if the widget is destroyed… within reason, of course (e.g. don’t destroy it whilst it’s being used).
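Here the interface could hand out a weak_ptr while the library keeps the only long-lived shared_ptr. A minimal sketch, in which `WidgetLibrary`, `shutdown` and `use_widget` are hypothetical names invented for illustration:

```cpp
#include <cassert>
#include <memory>

// Hypothetical library type; the body is invented for illustration.
struct Widget {
    int id = 3;
};

// Hypothetical library that keeps the only long-lived owning pointer.
class WidgetLibrary {
public:
    // Returning weak_ptr tells the caller: you may look, but you do not own.
    std::weak_ptr<Widget> get_widget() const { return widget_; }

    // The library decides when the widget goes away.
    void shutdown() { widget_.reset(); }

private:
    std::shared_ptr<Widget> widget_ = std::make_shared<Widget>();
};

// Returns the widget id, or -1 once the library has destroyed the widget.
int use_widget(const std::weak_ptr<Widget>& weak) {
    if (auto w = weak.lock()) {  // the widget cannot be destroyed while w exists
        return w->id;
    }
    return -1;
}
```

The "within reason" part falls out of `lock()`: if the library calls `shutdown()` while a user is inside the `if` block, the widget survives until that temporary shared_ptr is released.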

There you go, plain and simple examples of how ownership semantics are represented via the 3 types of smart pointers.

Obviously, smart pointers don’t cover all edge cases, especially when it comes to efficient resource management in lock-free code. However, they cover most of the common ones.

This, in my opinion, is how we should look at smart pointers, through the lens of ownership, rather than RAII.
