My name is Marc Brooker. I've been writing code, reading code, and living vicariously through computers for as long as I can remember. I like to build things that work. I also dabble in brewing, cooking and skiing.I'm currently an engineer at Amazon Web Services (AWS) in Seattle, where I lead engineering on AWS Lambda and our other serverless products. Before that, I worked on EC2 and EBS.All opinions are my own.

Ice Cream and Distributed Systems

When I was a child, I really liked to eat ice cream. It's still pretty great, but back then I was somewhat fanatical about it. My parents knew that the delicious mix of fat and sugar was best served only occasionally, and would carefully limit my intake. I, of course, found a way around the system. First, I'd go to my mother and ask if it was time for ice cream, and she would give me an answer. If she answered in the negative, I'd ask my father the same question. That strategy increased by chances of an affirmative answer, because the decisions that my parents made were not consistent. Occasionally, I'd even eat some served by my mother, and then try my father for a second bowl.

After a while of running this scheme, my parents figured it out. They decided that they needed to give me a consistent answer, and the only way to do that was to talk to each other every time I asked the question. Their coordination approach worked great. It guaranteed a consistent answer, and only made young Marc wait a little longer for his question to be answered.

It all broke down when my parents went to work. Being a child, I could find a good excuse to speak to either parent at any time, but their jobs prevented them from speaking to each other. Once again, I could use the situation to my rich, sweet, and creamy advantage. With my parents unable to communicate, I was able to force an inconsistent decision. Did my parents miss a trick that would have allowed them to make a consistent ice cream serving decision without being able to talk to one another?

Assume that the network consists of at least two nodes. Thus it can be divided into two disjoint, non-empty sets: {G1, G2}. The basic idea of the proof is to assume that all messages between G1 and G2 are lost. Then if a write occurs in G1, and later a read occurs in G2, then the read operation cannot return the results of the earlier write operation.

Assuming that my parents didn't have watches, and had to make the decision based only on the messages they have received and their internal state, Gilbert and Lynch proved that they can't have made a consistent decision in general. That's a general result about writes and reads. Could they do better in this specific case?

Getting crafty with clocks

Around that time, my parent's ice cream policy was that I got one bowl a week. They met before work every day when I hadn't reached my weekly allocation, and decided that for the next eight hours only one of them could hand out an ice cream decision. If it was my mother's day, and I called my father for a decision, he told me that he couldn't give me one. As long as I can could contact my mother, I could get a consistent answer. If I couldn't reach my mother, I was out of luck. The one saving factor, though, was that if mom worked late, dad would notice the eight hours had expired and make a decision.

Soon, being the crafty young man I was, I realized that dad's watch ran slightly faster than mom's. When he got home, I'd go to my dad and ask for a serving of vanilla. He'd look at his watch, see that eight hours had expired, and assume that my mom had lost the authority to make the decision. After checking that the bowl in the freezer was still full, to make sure my mom hadn't decided during the day, he'd allow me to eat. I'd wolf down my frozen treat, and call my mom. Her slower watch told her the eight hours weren't up yet, and she'd give me the go ahead for a second bowl. I'd beaten the system!

[the client's lease time] is shortened by ... the allowance E for clock skew. As a minimum, the correct functioning of leases requires only that clocks have a known bounded drift

If my parents had read Gray and Cheriton, they would have known how to fix their lease protocol. My mom and dad would have had to measure the skew between the rate of their watches, and added some time (E) to the time my dad waited before assuming my mom didn't have the lease anymore.

Putting the results back together

Thanks to a diet fad sweeping the nation, my parents decided that ice cream wasn't as bad as they assumed. Being responsible parents they still wanted to track my consumption to check their hypothesis. During the work day, my mom and dad went back to making inconsistent decisions, and each just kept their own records of how often they said yes. Once they were home together again, they could add up their numbers to get an accurate total.

Tracking flavors was a little bit harder. Every time I called to request a scoop, they would write down which flavor I got permission for. Occasionally I'd go to the freezer and find that flavor was out, and I'd call and ask to reduce my tally. Being a small child with a short memory, I couldn't remember if mom or dad had recorded the yes, so I'd call one at random to record the no. That didn't matter, because they could still total their independent counts at the day's end to get an accurate tally. An accurate tally, that is, until disaster struck.

I'd received permission from mom to eat some strawberry gelato, but found none in the usual place between the ice trays and frozen juice. I called her back to report the failure, but the line dropped before I could say goodbye. Distraught at being rudely disconnected from my mother, I called my dad to report the same thing. When the tally was done at the end of the day, my parents were baffled by the count of negative one. Had we invented anti-gelato?

My parents unfortunately hadn't kept track of developments in conflict-free replicated data types. If they had, they could have solved this problem with an OR set, by tracking additions and removals with unique tags. If they had been armed with that paper, and research on CRDTs new and old, could they have gone back to restricting my intake? The intuition is obvious: if we can count something independently, and we can manage a set independently, can we enforce one bowl per day independently? Unfortunately not. The important difference is that add one and add to set are commutative, while reduce by one if the count is greater than zero isn't.

Getting everybody to agree

Following their frustrating battle with flavor tracking, my parents asked their part-time housekeeper Mary to help with the problem. After losing faith in fad diet books, my parents both dedicated part of their work day to investigating the health properties of ice cream, and frequently changed their opinion. They also wanted to keep careful track of how much I was served, in case the dosage turned out to be important to my health. Mom and Dad agreed with Mary that she could allow me to eat some as long as all of them agreed when I asked. Mary was happy with this, but there was one big problem: she didn't like talking on the phone. Fortunately, she loved to send text messages. Unfortunately, texts were still strangely expensive back then.

Mary, Mom and Dad sat down and tried to figure out how to all agree on the problem with the fewest number of messages. Mary invented a simple scheme: when I asked her if I could have some ice cream, she messaged both my mom and dad and ask for their opinion, while asking that they didn't change their opinion until hearing back from her. If they both agreed, she'd go ahead and let them know she was going to serve dessert. If either said no, she let them know that the bowl would remain empty. The protocol, which they called two-phase commit after the frozen and liquid phases of ice cream, took four messages to complete. Could Mary do better and save some money on text messages?

Any commitment protocol ... requires at least 2(n - 1) messages to commit a transaction in the absence of processor failures.

Luckily for them, my parents didn't waste too much time thinking about the problem. Mary came across a paper from Cynthia Dwork and Dale Skeen which laid out what she needed to know. As long as Mary was sending text messages, there was no way to do better than her protocol.