Lessons Learned Building Distributed Systems with CQRS and Event Sourcing

By Patrick Lee Scott (@patrickleet). Read my story at https://patscott.io/

Several years ago, I had an idea, or more of a drive really. Maybe even an obsession. I wanted to build the holy grail of efficiency in development. It didn’t have to be for everyone. I wanted it for me.

I wanted to build highly scalable fault tolerant systems made up of simple services that were maintainable, malleable, withstood the test of time, and importantly — were easy for humans to understand.

I wanted my services to be easily testable, and I didn’t want to have to think about how they ran in production.

I wanted myself and my teammates to be able to deliver high quality code quickly. When code is delivered more quickly, that means your business can experiment more quickly, and therefore, find winning ideas more quickly.

This is basically the entire point of the book “The Lean Startup”.

Build. Measure. Learn. Repeat. And so it continues, from the startup into the enterprise.

To win, you need to build, measure, learn, and repeat and repeat and repeat.

It gives our businesses competitive advantages.

And hopefully, that means everyone makes more money, which means we (or maybe me) can focus on what we really want: enjoying life, rock climbing, traveling, writing, spending time with my beautiful girlfriend, and let’s be honest, probably programming some more... or too much.

However, we cannot just be faster. We cannot sacrifice on quality.

Our code powers basically all the things, and all the things can’t be breaking all the time! Especially when I’m at the beach!

Turns out, such things are easier said than done. It’s not easy to go from reading Domain Driven Design to knowing what the hell that actually means in practice.

I’ve heard others joke that the time spent in understanding and being able to implement DDD’s techniques is equivalent to getting a PhD.

I first read DDD in 2007 when I was a sophomore in college. I wouldn’t even say I am a DDD expert now. However, there are some REALLY useful concepts in DDD, and the languages of today are much more powerful and expressive than they were in 2003, when the book and its examples were written.

It’s not until now, several years later, that I’ve figured out how to do it well. More importantly, I have run a few production systems using these techniques and formalized my views and approaches on the matter so I can share what I’ve learned with others!

Here’s my view: It is complicated, but mostly because of the signal to noise ratio.

There’s a whole lot of noise. Modeling problems aside, that still leaves literally hundreds of libraries and approaches to building microservices, because “microservice” basically just means “a really small and focused service”.

I want to be the strong signal in that noise that you can follow to success.

Additionally, just having a bunch of services means a whole bunch of new headaches in general!

More services to test!

More integrations to test!

More deployments to deploy!

More databases to provision!

More caches to invalidate!

More everything!

The act of making simple services makes your architecture and operations more complex.

It just does.

It’s how it works.

However…

That doesn’t mean it needs to be complicated.

Quick analogy: Anyone remember the CSS Sliding Doors technique?

That’s back when CSS was awful and it took about 100 lines of code to make rounded corners.

Now, it’s one line of code.

Things tend to get easier over the years.

And yet, it’s still hard to find a consensus on how to build microservices!

But don’t fret, I’m here to direct you to three key areas that will help you tackle the multi-headed beast.

1. Learn more design patterns!

The next thing in software engineering is always standing on the shoulders of its predecessors. Without the minds and thoughts of the thousands of engineers who came before, the next set of thoughts and patterns would not be possible.

If you don’t love patterns already, well, I’m surprised you’re an engineer! If you come across a problem, chances are someone has already solved it, or at least some variety of it.

Design Patterns are essential to building highly scalable and fault tolerant systems that are human friendly!

In my everyday work, I make use of all sorts of patterns all the time! Here are several awesome patterns that come to mind: Event Sourcing, Repository Pattern, Singleton, Factory, CQRS, Circuit Breakers, POJOs, and more!

Beyond classic patterns, I’ve also come across some “microservice” patterns over my years that help to think about designing large systems for enterprises from a higher-level view.

There is no better resource than the classic “Blue Book” to get you started with the concepts. It’s so popular that it can go by a color and other engineers will know what you are talking about: Domain-Driven Design: Tackling Complexity in the Heart of Software, by Eric Evans.

As far as microservice specific patterns, here are some of the most common ones I use: These Five Microservice Patterns Will Make You a Better Engineer.

Moving on…

2. Objects are important, but you know what’s also really important? Events and Commands.

Commands are how things happen. Events are what has happened.

Commands and Events are both messages.

Entities are what events happened to. Aggregates are collections of related entities.
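Here is a minimal sketch of how these four ideas might look as plain JavaScript objects. The field names (`type`, `payload`, `products`, `sku`) and the `handleCatalog` function are illustrative assumptions, not part of any library.

```javascript
// A command asks for something to happen. It can still be rejected.
const catalogCommand = {
  type: 'inventory.product.catalog',
  payload: { sku: 'abc-123', name: 'Red ball' }
}

// An aggregate groups related entities: here, an inventory and its products.
const inventory = { id: 'inventory-1', products: [] }

// Hypothetical handler: checks the command against the aggregate and, if it
// is accepted, returns the event describing what happened.
function handleCatalog (aggregate, command) {
  const exists = aggregate.products.some(p => p.sku === command.payload.sku)
  if (exists) {
    throw new Error(`product ${command.payload.sku} is already cataloged`)
  }
  return { type: 'inventory.product.cataloged', payload: command.payload }
}

const catalogedEvent = handleCatalog(inventory, catalogCommand)
console.log(catalogedEvent.type) // inventory.product.cataloged
```

Note the naming: the command is in the imperative (`catalog`), while the event is in the past tense (`cataloged`).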

It turns out that all of these things are really, really, important.

Oftentimes, with the focus on OO principles, you’ll hear people talking about Entities, and maybe Aggregates, but the Events and Commands are lost! This is even worse when you are just updating a database with the new state. All of the history of the world you’ve modeled is lost with every UPDATE.

I find it very sad. 😢
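To make that loss concrete, here is a small sketch contrasting an update in place with an append-only list of events. The `inventory.product.priced` event name and the `price` field are made up for the illustration.

```javascript
// Update in place: the previous value is gone after each write.
const row = { sku: 'abc-123', price: 5 }
row.price = 7
row.price = 6

// Append-only: every change is kept as an event.
const log = [
  { type: 'inventory.product.priced', payload: { price: 5 } },
  { type: 'inventory.product.priced', payload: { price: 7 } },
  { type: 'inventory.product.priced', payload: { price: 6 } }
]

// Both agree on the current price...
const currentPrice = log[log.length - 1].payload.price

// ...but only the log can answer questions about the past.
const priceHistory = log.map(event => event.payload.price)

console.log(row.price, currentPrice, priceHistory) // 6 6 [ 5, 7, 6 ]
```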

Before this was understood, ORMs were popular. Object-relational mapping has been referred to by some as the “Vietnam of Computer Science”. We didn’t win the war with ORMs.

The Vietnam of Computer Science · Ted Neward’s Blog

Object-Relational Mapping is the Vietnam of Computer Science · Coding Horror

Many proponents of Domain Driven Design evolved their thinking over the years to move away from the focus on the Nouns, and began to usher in a new era of Events. Check out the works of Greg Young, Udi Dahan, and Rinat Abdullin. Even the Google Group “DDD” eventually was renamed to CQRS/ES+AR (Command Query Responsibility Segregation with Event Sourcing on Aggregate Roots)!

Events are the language of distributed systems… And life really.

When modeling the world, you need to model the Events and Commands of the world as well. Events and commands work really well as the language of a distributed system.

When I visualize systems, I like to imagine paper forms being filled out and passed between human actors. That’s essentially a distributed system, and it’s an easy way to think about eventual consistency.

For example, the command

inventory.product.catalog

yields the event

inventory.product.cataloged

and anyone who is interested can react to it:

bus.on('inventory.product.cataloged', reactToTheFactThatThisEventHappened)

Expanding on that, I also want to introduce a very simple mathematical equation:

state = leftFold([...previousEvents])

The state is the left fold of the previous events.

For those of you who speak JavaScript:

const eventsourcing = (events, snapshot = {}) =>
  events.reduce(
    (state, event) => Object.assign({}, state, event.payload),
    snapshot
  )

The events ARE a normalized, immutable source of truth for your domain.

The current state therefore is derived by applying the events on top of each other sequentially.

If you know everything that has happened in your subset of the world — your bounded context — then you can determine the state of that world.
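As a rough usage sketch of the fold above, here is what replaying a few events might look like, including the optional `snapshot` argument. The event names and payload fields are invented for the example; the `eventsourcing` function is repeated so the snippet runs on its own.

```javascript
// Same fold as above: merge each event's payload into the running state.
const eventsourcing = (events, snapshot = {}) =>
  events.reduce((state, event) => Object.assign({}, state, event.payload), snapshot)

// Hypothetical events for a single product.
const events = [
  { type: 'inventory.product.cataloged', payload: { sku: 'abc-123', name: 'Red ball' } },
  { type: 'inventory.product.priced', payload: { price: 5 } },
  { type: 'inventory.product.renamed', payload: { name: 'Big red ball' } }
]

// Replaying everything from the beginning...
const fromScratch = eventsourcing(events)

// ...gives the same state as resuming from a snapshot taken after two events.
const snapshot = eventsourcing(events.slice(0, 2))
const fromSnapshot = eventsourcing(events.slice(2), snapshot)

console.log(fromScratch)
// { sku: 'abc-123', name: 'Big red ball', price: 5 }
```

A snapshot is only an optimization: it saves replaying a long history, but the events remain the source of truth.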

Here’s a simple, really contrived example: imagine you are building a robot which picks up and places items onto the surface of a table.

The table could be your aggregate.

Are you using a table right now? On it is probably a laptop, or maybe a TV remote.

The context in this case is the problem at hand: we want to know what objects are on a table and what surfaces are available for new items. Our model only needs to contain information relevant to that task.

You can imagine that if you were writing software for a warehouse, your idea of what a table is and what information you would care about might be very different.

The context is important. In DDD, this is what Evans refers to as a bounded context.

Anyway, on with the example…

Let’s command the robot to place a ball on the table.

To do this, I use a library called “servicebus”. Servicebus is really cool because it allows you to use middleware for events so you can easily add things like retry or deduping logic backed by Redis, or tracing with very little effort. It was originally built on RabbitMQ, but I’ve been working on a Kafka version as well which supports the original plugins.
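The middleware idea can be sketched without any broker at all. The following is a hypothetical, in-memory illustration and deliberately not servicebus’s actual API: `createPipeline`, `dedupe`, and the `id` field on messages are all assumptions made for the example, and a `Set` stands in for Redis.

```javascript
// Hypothetical middleware runner; this is NOT servicebus's API.
// Each middleware receives the message and a `next` callback.
function createPipeline (middlewares, handler) {
  return message => {
    const run = index => {
      if (index === middlewares.length) return handler(message)
      return middlewares[index](message, () => run(index + 1))
    }
    return run(0)
  }
}

// Dedupe middleware: drops messages whose id has been seen before.
// A Set stands in for the Redis store mentioned above.
function dedupe (seen = new Set()) {
  return (message, next) => {
    if (seen.has(message.id)) return
    seen.add(message.id)
    return next()
  }
}

const handled = []
const onPlaced = createPipeline([dedupe()], message => handled.push(message.id))

onPlaced({ id: 'evt-1', type: 'table.item.placed' })
onPlaced({ id: 'evt-1', type: 'table.item.placed' }) // redelivery, ignored
onPlaced({ id: 'evt-2', type: 'table.item.placed' })

console.log(handled) // [ 'evt-1', 'evt-2' ]
```

The handler itself stays free of delivery concerns, which is the appeal of pushing retries, deduping, and tracing into middleware.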

Sending the command looks like this:

bus.send('table.item.place', {
  type: 'ball',
  properties: { color: 'red' },
  position: { top: 1, left: 1, unit: 'inch' }
})

When it happens, the robot can confidently declare “I placed the ball on the table! It’s positioned 1 inch from the top, and 1 inch from the left!”

It happened.

It can’t unhappen.

It’s an immutable fact. The ball was placed on the table. Period.

Let’s let the rest of the world know, so they can respond to the event if they are subscribed.

bus.publish('table.item.placed', { item })

Now, I want to stress that it is an immutable fact that this event occurred.

The newspaper has already been published and sent out the door!

If you want to undo it, your only option is another command — table.item.remove

That command, when successfully completed, leads to the event table.item.removed being published.

If you are a third party that cannot see the table, but you are subscribed to the events that have occurred on the table, you can determine the current state of the table.

A ball was placed on the table, and then removed. The current state is an empty table.
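That reasoning can be sketched as a fold over the table’s events. The `applyEvent` and `replay` names and the `id` field on items are assumptions for the example; the event names are the ones used above.

```javascript
// Hypothetical reducer: applies one event to the table's state.
function applyEvent (table, event) {
  switch (event.type) {
    case 'table.item.placed':
      return { items: [...table.items, event.item] }
    case 'table.item.removed':
      return { items: table.items.filter(item => item.id !== event.item.id) }
    default:
      return table // ignore events we don't understand
  }
}

// The state is the left fold of the events, starting from an empty table.
const replay = events => events.reduce(applyEvent, { items: [] })

const ball = { id: 'ball-1', type: 'ball', properties: { color: 'red' } }
const history = [
  { type: 'table.item.placed', item: ball },
  { type: 'table.item.removed', item: ball }
]

console.log(replay(history.slice(0, 1)).items.length) // 1
console.log(replay(history).items.length) // 0
```

Neither event was edited or deleted to get back to an empty table; the removal is simply one more fact in the history.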

This sort of event-based architecture is known as an “Eventually Consistent” system.

The third party does not know instantly as soon as the ball is placed on the table, however, it receives a message stating that the event occurred. Once the message has been received, the receiver can determine the new state of the table.
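Here is a small simulation of that delay. It uses a made-up in-memory queue rather than a real broker; `publish`, `subscribe`, and `deliverAll` are hypothetical helpers that exist only to make the gap between "it happened" and "I heard about it" visible.

```javascript
// Hypothetical in-memory stand-in for a message broker: published events
// wait in a queue until they are delivered to subscribers.
const queue = []
const subscribers = []
const publish = event => queue.push(event)
const subscribe = handler => subscribers.push(handler)
const deliverAll = () => {
  while (queue.length > 0) {
    const event = queue.shift()
    subscribers.forEach(handler => handler(event))
  }
}

// The third party keeps its own copy of the table's state.
const replica = { items: [] }
subscribe(event => {
  if (event.type === 'table.item.placed') replica.items.push(event.item)
})

publish({ type: 'table.item.placed', item: { type: 'ball' } })
const before = replica.items.length // the event happened, but the replica is stale

deliverAll()
const after = replica.items.length // the replica has caught up

console.log({ before, after }) // { before: 0, after: 1 }
```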

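It’s a very popular pattern as well, largely because of the CAP theorem. CAP stands for Consistency, Availability, and Partition Tolerance, and the popular summary is that you can only pick two. More precisely, a distributed system cannot avoid network partitions, so when one occurs it has to choose between consistency and availability. Different parts of the system can optimize for different goals, and the system at large generally sacrifices on consistency.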

For example, it’s ok if you didn’t know Billy posted that rad new Instagram until 23 seconds later. You eventually get it.

Although I’ve been building services like this for years, the approach has recently been making the rounds again under the name “event-driven architectures”.

It’s also one step away from CQRS. All you need is to subscribe to a few event streams and create a projection of the data that is suited for the application at hand. The process is called denormalization, and hence I call services that do this “denormalizers”.
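A denormalizer can be very small. The sketch below keeps a read model answering one question, "how many items of each color are on the table?", and updates it as events arrive. The `createColorCountDenormalizer` name and the shape of the read model are assumptions for illustration.

```javascript
// Hypothetical denormalizer: keeps a read model of how many items of each
// color are currently on the table, updated as events arrive.
function createColorCountDenormalizer () {
  const countsByColor = {}
  return {
    handle (event) {
      const color = event.item.properties.color
      const current = countsByColor[color] || 0
      if (event.type === 'table.item.placed') countsByColor[color] = current + 1
      if (event.type === 'table.item.removed') countsByColor[color] = Math.max(0, current - 1)
    },
    // The query side only ever reads the precomputed projection.
    query: color => countsByColor[color] || 0
  }
}

const denormalizer = createColorCountDenormalizer()
const tableEvents = [
  { type: 'table.item.placed', item: { type: 'ball', properties: { color: 'red' } } },
  { type: 'table.item.placed', item: { type: 'cup', properties: { color: 'red' } } },
  { type: 'table.item.removed', item: { type: 'ball', properties: { color: 'red' } } }
]
tableEvents.forEach(event => denormalizer.handle(event))

console.log(denormalizer.query('red')) // 1
```

The write side never sees this structure. A different application could subscribe to the same events and build a completely different projection.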

Speaking of denormalizers, a funny story: one time, somebody on the Slack where I hang out was asking about CQRS and I accidentally wrote “demoralizer”. He was like, “Is there seriously a thing called a demoralizer?” 🤣

Do yourself a favor and read this still-relevant article from 2013: The Log: What every software engineer should know about real-time data’s unifying abstraction | LinkedIn Engineering, by Jay Kreps, co-creator of Kafka.

3. Automate your operations and infrastructure

DevOps — the intersection of Development and Operations — is in a renaissance.

As I mentioned earlier, with the simplicity of services, some complexity necessarily moves into your operations and architecture.

I knew I wanted to build highly scalable fault tolerant systems made up of simple services that were maintainable, long-lived, malleable, and easy for humans to understand.

I knew microservices and Domain Driven Design patterns would allow me to deliver on all of those goals as well as allowing myself and my team to consistently deliver high quality code that could stand the test of time.

However, just running them in production turned out to be a pretty monumental task.

I was new to DevOps and didn’t even know where to start.

After googling, I came across a ton of AWS courses. There were five different levels of certifications, and each took months of time and hundreds of dollars in lessons.

This was no small order. I knew the de facto standard was to become an AWS Solutions Architect. The problem was that this was the 5th level of certification for AWS, and until that point I had basically only deployed to PaaS providers like Heroku and Modulus, or someone else had handled DevOps.

So finally, after spending years figuring out enough of the whole DDD and CQRS/ES+AR thing, I still had to become an expert in another entirely different subject area just to be able to do it effectively.

Remember those CSS sliding doors I talked about? The ones that took about 100 lines of code to make a tab in HTML and CSS with rounded corners?

Luckily, things get simpler over time.

It’s easier than ever to get dangerous with DevOps. Read about My Journey to DevOps Bliss, Without Useless AWS Certifications and make sure to grab your free three week email course at the end!

Conclusion

That’s all for today! Thanks for reading. If you have any questions, or if you’ve found this helpful, I’d love to hear your thoughts in the comments.

The best way to help me reach others is sharing on social media!

Best,

Patrick Lee Scott

P.S. Jørn André Myrland asked a great question in the comments — make sure you check out my answer for a better higher level picture. (https://medium.com/@patrickleet/glad-you-asked-1c5229ee3af6)

Want to learn more about me and my story? Click here to read about me and how my agency, Unbounded, can help you build distributed systems.
