Atoms by experiment

January 30, 2014

Previously, I wrote about how Clojure separates values and identities using reference types, and why that’s cool from a conceptual standpoint.

But I glossed over how reference types are actually used. And I ignored the elephant in the room, the problem that motivates reference types’ design: concurrency.

In this post, I’ll talk about atoms, the simplest reference type Clojure has, and how they make basic concurrency work right.

In pseudocode

When you combine mutable state with concurrency, you have a problem. It’s far too easy to get into an inconsistent state. Consider this pseudocode for incrementing a counter:

1. Let c = the current value of the counter.
2. Set the value of the counter to c + 1.

What happens when multiple threads are executing this code at the same time?

Thread A runs step 1, lets c = 3.
Thread B runs step 1, lets c = 3.
Thread A runs step 2, sets counter to 4.
Thread B runs step 2, sets counter to 4.

Oops! We tried to increment twice, but the counter only went up by one.

In Clojure, with atoms

You can write the above code in Clojure and actually observe its bugginess.

If you’ve read my previous post or messed around with Clojure, then you know an atom is like a mutable container for a (usually immutable) value. You create and dereference it like this:

    user=> (def a (atom [1 2 3]))
    #'user/a
    user=> @a
    [1 2 3]

And the easiest way to change it is reset!:

    user=> (reset! a [4 5 6])
    [4 5 6]
    user=> @a
    [4 5 6]

OK, with prerequisites out of the way, let’s create an atom to serve as our counter, initially 0.

(def counter (atom 0))

To make concurrency problems easier to detect, we’ll write an increment function just like inc, but with a built-in delay (and a println):

    (defn increment-with-delay [x delay]
      (Thread/sleep delay)
      (println "Incrementing" x "after a delay of" delay "ms")
      (inc x))

Now let’s create two threads that each update the counter, one with a longer delay than the other:

    (do
      (future (reset! counter (increment-with-delay @counter 250)))
      (future (reset! counter (increment-with-delay @counter 500)))
      (Thread/sleep 1000)
      (println "The counter is now" @counter))

Paste that code in a REPL. Seriously, go do it, I’ll wait!

Done? Great. So you probably saw output like this:

    Incrementing 0 after a delay of 250 ms
    Incrementing 0 after a delay of 500 ms
    The counter is now 1

The first line is the fast thread. The second line is the slow thread.

When the slow thread reads the counter, it’s 0. But by the time it writes its result, the fast thread has updated the counter to 1. The slow thread never sees this value: it overwrites it with its own stale result, one increment is lost, and we end up with an inconsistent state.
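The same interleaving can be replayed in a single thread, which removes the timing from the picture. This is only a sketch: replay-counter is a hypothetical name, and the two "threads" are simulated by reading twice before writing twice.

```clojure
;; Hypothetical single-threaded replay of the lost update.
;; `replay-counter` is an illustrative name, not part of the post's code.
(def replay-counter (atom 0))

(let [fast-read @replay-counter   ; the fast thread reads 0
      slow-read @replay-counter]  ; the slow thread also reads 0
  (reset! replay-counter (inc fast-read))   ; fast thread writes 1
  (reset! replay-counter (inc slow-read)))  ; slow thread overwrites it with 1

(println "The counter is now" @replay-counter)
;; prints: The counter is now 1
```

Two increments ran, but the second write was computed from a value that was already out of date.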

Swapping, for consistency

So we see the problem I described is real.

Fortunately, we can fix it. To prevent multiple threads from leaving the atom in an inconsistent state like this, Clojure provides swap!. Recall that a was [4 5 6] before:

    user=> (swap! a conj 10)
    [4 5 6 10]
    user=> @a
    [4 5 6 10]

Here, swap! is called with our atom a, the function conj, and the extra argument 10. The atom’s new value is (conj current-value-of-a 10).
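The same calling convention applies to any function and any number of extra arguments: the atom's current value is passed first, followed by whatever else you give swap!. A small sketch, using hypothetical atoms named total and scores:

```clojure
;; Hypothetical example data; `total` and `scores` are illustrative names.
(def total (atom 10))
(def scores (atom {:alice 1}))

;; New value is (+ current-total 1 2 3)
(swap! total + 1 2 3)
;;=> 16

;; New value is (assoc current-scores :bob 2)
(swap! scores assoc :bob 2)
;;=> {:alice 1, :bob 2}
```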

Let’s try incrementing a counter again, but this time use a swap. First we’ll reset the counter to zero. (Setting an initial value like this is an appropriate use of reset!.)

(reset! counter 0)

Now we’ll run the same testing code as before, with swap! instead of reset!.

    (do
      (future (swap! counter increment-with-delay 250))
      (future (swap! counter increment-with-delay 500))
      (Thread/sleep 1000)
      (println "The counter is now" @counter))
    ; output:
    ; Incrementing 0 after a delay of 250 ms
    ; Incrementing 0 after a delay of 500 ms
    ; Incrementing 1 after a delay of 500 ms
    ; The counter is now 2

The first line is the fast thread. The second and third lines are the slow thread. This time, the slow thread notices its data is stale, and retries rather than writing an inconsistent value.
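One way to picture the retry is as a compare-and-set loop: read the value, compute the new one, and write it only if nobody else has changed the atom in the meantime. The function below is a sketch of that idea built on Clojure's compare-and-set!, not a copy of the real implementation of swap!; swap-sketch! and sketch-counter are hypothetical names.

```clojure
;; Illustrative only: a hand-rolled stand-in for swap!.
;; `swap-sketch!` is a hypothetical name, not a Clojure core function.
(defn swap-sketch!
  "Applies f to the atom's current value (plus args), retrying on conflict."
  [a f & args]
  (loop []
    (let [old-val @a
          new-val (apply f old-val args)]
      ;; compare-and-set! succeeds only if the atom still holds old-val.
      (if (compare-and-set! a old-val new-val)
        new-val
        (recur)))))   ; another thread got there first, so start over

(def sketch-counter (atom 0))
(swap-sketch! sketch-counter + 5)
;;=> 5
```

In the run above, the slow thread's first attempt corresponds to a failed compare-and-set!, and its second println corresponds to the recur.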

Note that because retries are possible, the update function you provide to swap! should be side-effect-free. We cheated and called println, which is a side effect, but we did that on purpose so we could observe the retry.
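To see how often the update function really runs under contention, here is a rough experiment. It counts calls in a second atom, which is itself a deliberate side effect used only for observation; hits, attempts, and counted-inc are hypothetical names.

```clojure
;; Hypothetical experiment: 100 threads increment one atom.
(def hits (atom 0))       ; the counter under contention
(def attempts (atom 0))   ; how many times the update function ran

(defn counted-inc [x]
  (swap! attempts inc)    ; deliberate side effect, for observation only
  (inc x))

;; Start 100 futures, then block until every one has finished.
(let [workers (doall (repeatedly 100 #(future (swap! hits counted-inc))))]
  (doseq [w workers] @w))

(println "hits:" @hits "attempts:" @attempts)
```

hits always ends at 100. attempts is at least 100 and may be higher, since every retry calls counted-inc again. A println or a database write in its place would be repeated in the same way.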

The role of atoms

Clojure has quite the zoo of reference types — aside from atoms, there are also refs, agents, and vars. So where do atoms fit in?

Atoms provide consistent access to a single identity. Changes to an atom are, well, atomic: either the entire change occurs or none of it. Changes are also consistent and, if you only make them with swap!, isolated. Those are the A, C and I properties of ACID.

However, all this only holds true when you’re dealing with a single atom. If you update two atoms, concurrent execution can still cause problems — another thread may see the first atom updated, but not the second.
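A small sketch of that gap, using two hypothetical account balances. It runs in a single thread, so total-midway simply records what another thread could have seen between the two updates. The second half shows one possible workaround when the values can live together, which is to keep them in a single atom; this is an assumption of the sketch and not the refs-based approach.

```clojure
;; Hypothetical example: two balances held in separate atoms.
(def checking (atom 100))
(def savings  (atom 0))

(swap! checking - 30)
;; Between the two calls, the total looks like 70 instead of 100:
;; the money has left checking but has not yet reached savings.
(def total-midway (+ @checking @savings))
(swap! savings + 30)

;; Alternative sketch: both balances in one atom, so the
;; transfer is a single atomic swap!.
(def accounts (atom {:checking 100 :savings 0}))

(defn transfer [m amount]
  (-> m
      (update-in [:checking] - amount)
      (update-in [:savings] + amount)))

(swap! accounts transfer 30)
;;=> {:checking 70, :savings 30}
```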

In the next post, Playing with refs, I show an example of this problem and how refs, which allow coordinated access to multiple identities, can prevent it.