Historically, multithreading has always been a double-edged sword. It can save a lot of time and resources by parallelising heavy computations, but it can just as easily introduce countless cases of unpredictable behaviour that are difficult to reproduce, debug, or even identify before they actually bring down your application.

Many of these problems are data races. A data race occurs when at least two threads access the same memory location concurrently, at least one of those accesses is a write, and nothing synchronises them. The order of such accesses is non-deterministic: sometimes the write happens before the reads, sometimes after, and other times while two (or more) threads are in the middle of reading.

Fortunately, it’s almost impossible to write such dangerous code in safe Rust. Thread safety is part of the language’s motto: Rust has been constructed in a way that guarantees it.

Rust enables safe concurrency with:

This article is all about using the latter for safe & effective multithreading.

Message-based communication over channels — multi-producer, single consumer.

Rust encourages us to share memory by communicating (instead of communicating by sharing memory). In order to establish communication between threads, we need a channel — and std::sync::mpsc was created exactly for that.

Each channel has two ends: a sender and a receiver. It is also unidirectional: messages can only be passed from the sender to the receiver, never the other way around. What is specific to MPSC channels is that there can be many senders (message producers), but there is always exactly one receiver (consumer).

Each MPSC channel has exactly one receiver, but it can have many senders.

Channels are a great choice when the problem can be split into n smaller sub-problems. For example, imagine that we need to find out how many times a given word occurs in an extremely long text. We can split the text into n smaller chunks, pass these chunks to n worker threads (each keeping its own message producer of our “main” channel) and do the counting in parallel. Eventually, each producer sends its partial count over the channel, and the sum of those counts answers our initial question. For large inputs, this divide-and-conquer strategy can be much faster than searching and counting sequentially over the entire text with a single thread of execution.

Moreover, sending data down the channel transfers ownership of that data to the receiver, which is one of the building blocks of safe concurrency in Rust. If you are not familiar with the concept of ownership in Rust: every value has exactly one owner at a time, and the compiler uses this rule to reject illegal memory accesses and data races at compile time.

Let’s now see how to create a simple channel and send one message over it, without introducing multiple threads yet:

A very simple channel.

The mpsc::channel function returns a tuple consisting of a sender and a receiver. Then, in line 6, a message is sent. Both sending and receiving return a Result; for the sake of simplicity we assume that no error can happen in such a simple example, which is why unwrap is used here. The message is finally received in line 8 and printed out.
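A minimal sketch of such a single-threaded channel could look as follows. The helper name send_and_receive and the message text are made up for illustration, and the line numbers cited above refer to the original listing, not to this sketch.

```rust
use std::sync::mpsc;

// Sends one message over a fresh channel and returns what the receiver got.
// `send_and_receive` is a hypothetical helper, not part of the article's code.
fn send_and_receive(message: &str) -> String {
    // `channel` returns a (sender, receiver) tuple.
    let (sender, receiver) = mpsc::channel();

    // Both `send` and `recv` return a `Result`. Unwrapping is acceptable here
    // because the receiver is still alive and a message is already queued.
    sender.send(message.to_string()).unwrap();
    receiver.recv().unwrap()
}

fn main() {
    println!("{}", send_and_receive("Hello, channel!"));
}
```

Note that the String passed to send is moved into the channel, so the sending side can no longer touch it afterwards.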

So how can we have more than one sender (message producer)? That’s easy: simply call the clone() method on the original sender. We’ll see that in action shortly, when passing these senders to worker threads.
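To make cloning a bit more tangible, here is a rough sketch that reuses the word-counting idea from earlier: every worker thread receives its own clone of the sender and reports a partial count. The function count_word and the way the text is chunked are assumptions made for this illustration only.

```rust
use std::sync::mpsc;
use std::thread;

// Counts occurrences of `word` in `text` using up to `workers` threads.
// `count_word` is a hypothetical helper written for this illustration.
fn count_word(text: &str, word: &str, workers: usize) -> usize {
    let words: Vec<String> = text.split_whitespace().map(String::from).collect();
    // Ceiling division; never below 1, because `chunks(0)` would panic.
    let workers = workers.max(1);
    let chunk_size = ((words.len() + workers - 1) / workers).max(1);

    let (sender, receiver) = mpsc::channel::<usize>();
    for chunk in words.chunks(chunk_size) {
        // Each worker owns its own clone of the sender and its own data.
        let sender = sender.clone();
        let chunk = chunk.to_vec();
        let word = word.to_string();
        thread::spawn(move || {
            let count = chunk.iter().filter(|w| **w == word).count();
            sender.send(count).unwrap();
        });
    }
    // Drop the original sender: the channel closes once every clone is gone,
    // which is what ends the iteration below.
    drop(sender);

    receiver.iter().sum()
}

fn main() {
    let text = "a rose is a rose is a rose";
    println!("{}", count_word(text, "rose", 3));
}
```

Dropping the original sender is the one non-obvious step: without it the receiver would keep waiting for messages that can never arrive.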

Let’s now use threads & channels to solve some computation-heavy problem.

Finding a hash of given difficulty with multiple threads of execution

While searching for occurrences of a word in a text can technically be parallelised, it not only gives little performance boost (except for really, really huge texts), it is also… quite a boring problem.

For the purpose of this article, let’s write a very simple implementation of the idea behind cryptocurrency mining algorithms (so-called “proof of work”), which attempts to solve an interesting cryptographic challenge.

At a very basic level, given a “base” constant number (let’s call it BASE), the task is to find a number x such that when the two are multiplied and the product is hashed (using the SHA-256 hashing algorithm), the resulting hexadecimal string ends with a desired sequence of characters. For simplicity, let’s assume that we search for the solution within the range of the 64-bit unsigned integer type. The pseudocode below is worth a thousand words:

The "000000" in line 4 defines the difficulty of the task (and believe me, adding just one more zero to the sequence increases the difficulty dramatically: each extra hexadecimal digit makes a matching hash 16 times less likely). If you are curious, the solution (value of x) for the problem above is 3305951 and it produces a 2a44903ffc6affe69d514ffe47721cc3a6475cbb43b37538686f2c5b46000000 hash (note the trailing sequence of zeroes).

This means that, in order to find the solution, our theoretical algorithm had to run about 3.3 million times, hashing a number each time (which becomes a rather expensive operation when run that many times). Quite a lot for a single-threaded, sequential search.
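The sequential search can be sketched roughly as below. To stay free of external crates, the sketch swaps SHA-256 for a small, non-cryptographic bit mixer called toy_hash; that function, the BASE value of 42, the shortened "000" difficulty and the find_solution helper are all assumptions for illustration, not the article's actual code.

```rust
// Hypothetical values chosen so that the sketch finishes quickly.
const BASE: u64 = 42;
const DIFFICULTY: &str = "000";

// Stand-in for SHA-256 so that the sketch needs no external crate: a
// SplitMix64-style bit mixer rendered as 16 hex digits. NOT cryptographic.
fn toy_hash(value: u64) -> String {
    let mut z = value.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    format!("{:016x}", z ^ (z >> 31))
}

// Tries x = 0, 1, 2, ... below `limit` and returns the first x for which
// the hash of BASE * x ends with DIFFICULTY.
fn find_solution(limit: u64) -> Option<u64> {
    (0..limit).find(|&x| toy_hash(BASE.wrapping_mul(x)).ends_with(DIFFICULTY))
}

fn main() {
    match find_solution(1_000_000) {
        Some(x) => println!("x = {}, hash = {}", x, toy_hash(BASE.wrapping_mul(x))),
        None => println!("no solution below the limit"),
    }
}
```

With three trailing hex zeroes, roughly one hash in 4096 matches, so the loop ends quickly; six zeroes would need about 16 million attempts on average.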

Let’s now think about how this could be sped up by dividing the problem into smaller pieces and passing them to, say, four worker threads for parallelisation. We want to make sure that:

each integer is checked only once (no more than one thread examines the given number);

as soon as the solution is found by any of the worker threads, it is passed through the channel to the main thread; additionally, the other threads are notified to stop their work (more about that later).

In order to achieve the first goal, we’ll simply tell thread-0 to start with number 0 and then verify every 4th integer: 4, 8, 12 and so on. On thread-1 we’ll start with 1 and proceed with 5, 9, 13 (and so on). On thread-2 we’ll start with 2 and proceed with 6, 10, 14… see the pattern? That way we can be sure that every possible integer is checked exactly once per application run.
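The interleaving described above can be expressed with an open-ended range and step_by. The helper candidates below is a hypothetical name used only to make the pattern visible.

```rust
// Returns the first `count` numbers assigned to worker `thread_id` when the
// integers are dealt out round-robin between `threads` workers.
// `candidates` is a hypothetical helper for illustration.
fn candidates(thread_id: u64, threads: usize, count: usize) -> Vec<u64> {
    (thread_id..)        // thread_id, thread_id + 1, thread_id + 2, ...
        .step_by(threads) // keep every `threads`-th number
        .take(count)
        .collect()
}

fn main() {
    for thread_id in 0..4 {
        println!("thread-{}: {:?}", thread_id, candidates(thread_id, 4, 4));
    }
}
```

Because every integer has exactly one remainder modulo 4, each number lands in exactly one worker's sequence.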

Another observation quickly emerges: all these worker threads fit perfectly to become message producers!

Our main thread can receive a message from any of the worker threads, notifying that a solution has been found!

Implementation

(The full code example is available on GitHub; please note that there are very minor differences between the independent gists below and the real codebase, mainly to make the gists more readable.)

main function

Let’s take a look at what’s going on in the main function:

We start by creating a flag (a “semaphore” of sorts) called is_solution_found, of type AtomicBool, on line 2. This is a special kind of boolean, which can be safely shared between threads. Because we are indeed going to share it, we additionally need to wrap it in Arc, which stands for atomically reference-counted pointer. This is one of the ways Rust makes concurrency safe: if we tried to pass the “bare” AtomicBool to more than one thread, the compiler would complain, because Rust’s famous ownership rules would be broken. Arc allows us to share, in a thread-safe manner, the ownership of the underlying data by keeping an internal count of pointers to the value. When the last pointer is dropped and the count reaches zero, the underlying data is dropped as well.
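As a small illustration of sharing an AtomicBool through Arc, consider the sketch below. The function raise_flag_from_worker is a hypothetical name, and SeqCst is used simply because it is the most conservative ordering.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Spawns one thread that raises the shared flag, waits for it to finish and
// returns the value the calling thread observes afterwards.
// `raise_flag_from_worker` is a hypothetical helper for illustration.
fn raise_flag_from_worker() -> bool {
    let is_solution_found = Arc::new(AtomicBool::new(false));

    // Cloning the Arc copies the pointer and bumps the reference count;
    // both handles refer to the very same AtomicBool.
    let flag_for_worker = Arc::clone(&is_solution_found);
    let worker = thread::spawn(move || {
        flag_for_worker.store(true, Ordering::SeqCst);
    });
    worker.join().unwrap();

    is_solution_found.load(Ordering::SeqCst)
}

fn main() {
    println!("flag raised: {}", raise_flag_from_worker());
}
```

Moving the clone into the closure, rather than the original, is what keeps the main thread's handle usable after the spawn.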

On line 3, a channel is instantiated, returning its sender and receiver ends. We then spawn four threads (numbered 0 to 4, exclusive) on lines 5–11, giving each of them an order to search_for_solution (that function is covered in the next section).

Finally, on line 13 we listen for a message coming from the worker threads. The receiver.recv() call blocks the main thread, but that’s exactly what we want in this case. When the solution is found, it arrives on line 14, in the Ok case.
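The wiring described in this section might be sketched as follows. This is not the original listing: find_in_parallel and the is_solution predicate are hypothetical, and the worker body is a simplified placeholder that checks the flag on every iteration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;

const THREADS: u64 = 4;

// `find_in_parallel` and `is_solution` are hypothetical; the worker body is
// a simplified placeholder for the real search function.
fn find_in_parallel(is_solution: fn(u64) -> bool) -> Option<u64> {
    let is_solution_found = Arc::new(AtomicBool::new(false));
    let (sender, receiver) = mpsc::channel();

    for i in 0..THREADS {
        // Each worker gets its own sender and its own handle to the flag.
        let sender = sender.clone();
        let is_solution_found = Arc::clone(&is_solution_found);
        thread::spawn(move || {
            for number in (i..).step_by(THREADS as usize) {
                if is_solution_found.load(Ordering::Relaxed) {
                    return;
                }
                if is_solution(number) {
                    is_solution_found.store(true, Ordering::Relaxed);
                    let _ = sender.send(number);
                    return;
                }
            }
        });
    }
    // Only the workers' clones remain, so `recv` returns an error instead of
    // blocking forever if every worker exits without sending.
    drop(sender);

    // Blocks the calling thread until the first message arrives.
    match receiver.recv() {
        Ok(solution) => Some(solution),
        Err(_) => None,
    }
}

fn main() {
    match find_in_parallel(|n| n * n == 1_522_756) {
        Some(n) => println!("found {}", n),
        None => println!("no worker reported a solution"),
    }
}
```

The predicate here has a single solution, so only one worker ever sends; with several valid answers, whichever worker reports first would win.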

search_for_solution & verify_number functions

search_for_solution iterates over every fourth number using a range (line 5) combined with step_by (the name of that method is self-explanatory). At the time of writing step_by was a nightly-only feature; Iterator::step_by has been stable since Rust 1.28. In the loop’s body, the verify_number function is called; it does the actual nitty-gritty work of checking the hash result (as described earlier).

If the solution is found, the is_solution_found flag is set to true (remember, as we said earlier, that an AtomicBool can safely be shared between many threads). AtomicBool’s store (and load) methods additionally take an Ordering argument, which specifies the memory-ordering guarantees of the operation, that is, how it may be reordered relative to other memory accesses; the details are beyond the scope of this article. A notification message is then sent down the channel on line 8. Because sender.send(msg) returns a Result, we pattern-match against it and assume that there is nothing more to do when it succeeds. In the end, we return from the function and call it a day.

If, however, the given number is not a solution to our cryptographic problem (which, as you correctly expect, will happen most of the time), two things can happen:

…every 1000 iterations (line 13) we additionally check the current value of the is_solution_found flag. If the solution has been found (by another worker thread, obviously), we return from the function, because there’s simply nothing more to do. Now, you might be asking yourself, why on earth perform the check once per 1000 iterations, and not every single time? Well, using atomics comes with some performance penalty, and reading the value on every turn of the loop would negatively affect the speed of our application. With great power comes great responsibility, as they say. But if the solution has not been found yet…

…we simply increment the loop turns’ counter on line 16.
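Putting the pieces of this section together, the worker could look roughly like the sketch below. It is a reconstruction from the description rather than the original gist: the parameter list, passing verify_number as a function pointer, the u64 message type and Ordering::Relaxed are all assumptions.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Sender};
use std::sync::Arc;

const THREADS: usize = 4;

// Sketch of the worker; signature and message type are assumptions.
fn search_for_solution(
    start_at: u64,
    verify_number: fn(u64) -> Option<u64>,
    sender: Sender<u64>,
    is_solution_found: Arc<AtomicBool>,
) {
    let mut iteration_no: u64 = 0;
    for number in (start_at..).step_by(THREADS) {
        if let Some(solution) = verify_number(number) {
            // Tell the other workers to stop, then notify the main thread.
            is_solution_found.store(true, Ordering::Relaxed);
            // A send error only means the receiver is gone already.
            if sender.send(solution).is_err() {
                eprintln!("receiver has stopped listening");
            }
            return;
        } else if iteration_no % 1000 == 0 && is_solution_found.load(Ordering::Relaxed) {
            // Another worker has found the solution in the meantime.
            return;
        }
        iteration_no += 1;
    }
}

fn main() {
    let (sender, receiver) = mpsc::channel();
    let flag = Arc::new(AtomicBool::new(false));
    // Run on the current thread, just to exercise the function.
    search_for_solution(2, |n| if n == 4_002 { Some(n) } else { None }, sender, flag);
    println!("{:?}", receiver.recv());
}
```

Checking the flag only once per 1000 turns means a worker may run up to 999 extra iterations after the solution has been found, which is the price paid for the cheaper loop.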

And here comes the last detail — the verify_number function:

Sha256::hash comes from an easy_hash crate.

As said earlier, this is the very core of the logic behind solving the cryptographic problem. Sha256::hash from the easy-hash crate is used to calculate the hash of the multiplication result. If the given number solves the problem (lines 8 and 9), Some(Solution(number, hash)) is returned. If it doesn’t, None pops out.
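The shape of verify_number can be sketched as below. Since the exact easy-hash API is not reproduced here, the sketch substitutes a non-cryptographic toy_hash for Sha256::hash; that stand-in, the BASE and DIFFICULTY values and the field layout of Solution are assumptions for illustration.

```rust
// Hypothetical values; the real code uses its own BASE and "000000".
const BASE: u64 = 42;
const DIFFICULTY: &str = "000";

// The number that solves the problem, together with the hash it produces.
struct Solution(u64, String);

// Stand-in for Sha256::hash: a SplitMix64-style bit mixer rendered as
// 16 hex digits. NOT cryptographic, used only to avoid external crates.
fn toy_hash(value: u64) -> String {
    let mut z = value.wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    format!("{:016x}", z ^ (z >> 31))
}

// Returns Some(Solution) when the hash of number * BASE ends with DIFFICULTY.
fn verify_number(number: u64) -> Option<Solution> {
    let hash = toy_hash(number.wrapping_mul(BASE));
    if hash.ends_with(DIFFICULTY) {
        Some(Solution(number, hash))
    } else {
        None
    }
}

fn main() {
    if let Some(Solution(number, hash)) = (0..1_000_000u64).find_map(verify_number) {
        println!("number = {}, hash = {}", number, hash);
    }
}
```

Returning an Option keeps the caller's loop simple: it only has to distinguish "keep searching" from "done, here is the answer".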

Summary

As we saw, Rust’s channels are an excellent choice whenever the task can be divided into smaller subtasks and the eventual result can be aggregated in one place. Having a single consumer, together with the transfer of data ownership when messages are passed, is what makes this concurrency model safe.

We also took a brief look at atomics, which are thread-safe versions of the primitive types. Combined with atomic reference-counting pointers (Arc), they are a very powerful tool. But all of this wouldn’t be as useful without the help of the compiler equipped with data ownership rules!

I encourage you to check out the full code example. Try to manipulate the values: change DIFFICULTY, BASE and/or THREADS. You will need a nightly Rust compiler, though; the instructions are in the readme file, so be sure to read that!

And last, but not least, special thanks to HadrienG from the Rust programming language forum for reviewing my code and sharing priceless hints!

Further reading