Datomic and CAP theorem (2017)

This was written before Datomic Cloud and Datomic Ions , but the underlying ideas are the same.

When I speak about Datomic, my most frequent questions are about CAP theorem, which Datomic seemingly subverts to do seemingly impossible things.

Here are notes from a reddit thread I made to tease these ideas out. r/clojure: Datomic and CAP theorem (2017)

My questions:

Is it accurate to say CAP is about distributed reads and has nothing to do with writes?

I think Datomic is strongly Consistent, Partition tolerant, and sometimes Unavailable. Is this statement objectionable and how can it be stated better?

CAP theorem is no longer an effective razor

CHPILL: Indeed, more and more people are arguing against using the CAP theorem to reason about distributed data systems. See Kleppmann: Please stop calling databases CP or AP

Okay, but lets try anyway.

Cognitect statements on CAP and ACID?

ACID

Here is my definition of ACID, summarized from Cognitect's explicit definition of ACID

DUSTIN GETZ: ACID transactions are properties of a single-writer process handling writes, and has nothing to do with reads.

CAP

I've never seen Stu or any agent of Cognitect take a stance on Datomic and CAP. Probably because it's often misunderstood. The closest direct quote about CAP I have ever seen is:

STU: A Datomic system is defined as one transactor and N readers. Potential transactors coordinate so that only one can write at a time. In CAP parlance, Datomic has a traditional strongly-consistent model. source

At the bottom of the Datomic ACID page linked above, under heading "implications":

Datomic provides strong consistency of the entire database, while requiring only eventual consistency for the majority of actual writes. ... Another way to understand this is to consider the failure mode introduced by an eventually consistent node that is not up-to-date yet. Datomic will always see a correct log pointer, which was placed via conditional put. If some of the tree nodes are not yet visible underneath that pointer, Datomic is consistent but partially unavailable, and will become fully available when eventually happens.

What does that even mean?

They're saying that you can transact "origin/master is now is now abcd123" without abcd123 being available on all the developer machine checkouts. Datomic will fan-out new transactions to peers as they happen, which takes 1-10ms since same datacenter. If you need it sooner you can fetch-and-block for the 10ms, and this works because all queries have a time-basis so you know exactly what to fetch.

So in CAP, I think Datomic is CP reads

(Because CAP only applies to distributed systems, and Datomic is a single-writer system so CAP is not interesting to apply to the write side?)

strongly consistent (due to all queries having a time basis, like git),

partition tolerant (reads served from cache like git and cache is strongly consistent due to queries having time basis like git),

sometimes unavailable (the thing you need may not be in cache, like git)

How can we better word this?

Kleppmann: Stop calling databases CP or AP

The CAP theorem is too simplistic and too widely misunderstood to be of much use ... Within one piece of software, you may well have various operations with different consistency characteristics ... A huge amount of subtlety is lost by putting a system in one of two buckets (AP and CP)

The only fault considered by the CAP theorem is a network partition. but it’s not the only kind of thing that can go wrong: nodes can crash or be rebooted, you can run out of disk space, you can hit a bug in the software, etc

CAP theorem says nothing about latency, which people tend to care about more than availability

15 years later, we now have a much greater range of tools with different consistency and fault-tolerance models to choose from

Kleppmann key idea: linerizable

A frequent misconception Datomic beginners have is whether Datomic is strongly consistent. A more articulate phrase Kleppmann uses for this is "linearizable"

CHPILL: My understanding is that a Datomic system (eg a transactor and multiple peer) is not a linearizable system by default. There is always the possibility for a peer to lag behind in its consumption of the transaction log. But because Datomic exposes its monotonic time with the transaction ids, its actually very easy to make linearizable reads and writes if you need to using sync.

I think this is not quite right.

Yes, I think Datomic is always linearized

The key is that all questions have a basis-time.

DUSTIN GETZ: The assertion is: Datomic is not cap-C (linearized) by default, though it provides sync API for this when necessary. I am not sure I agree with this, because all Datomic queries are of the form "question Q at time T". Datomic doesn't have an operation for "question Q at time whatever-the-peer-has". So Bob and Alice are actually asking different questions. If Alice tells bob out-of-band "the time is T+1 and and this is who won", and Bob asks Datomic "who won at time T+1" these requests are linearized. The question "what is latest T" (d/sync) is also linearized. Thus Datomic is linearized always. It's the webservice that broke linearization by removing explicit T from all the questions and implicitly setting T to "whatever the peer has". (This is a perfectly acceptable tradeoff but it was done in the app, not the db)

Please write me on Clojurians slack #hyperfiddle, or any other medium, if you'd like to talk about this or have ideas how to make this more articulate.