Written by Dennis van der Stelt and David Boike on March 12, 2019

In our family it’s a tradition that you get to decide what we’ll have for dinner when it’s your birthday. On my daughter’s last birthday, she picked pizza. I took her to the nearby pizza shop to decide what pizza to get.

A large screen dominates one wall of the pizza place, showing each order as it progresses through each stage of preparation. As I was looking at the screen, I noticed some names suddenly switched places. Some pizzas with fewer toppings could be placed in the oven faster, and some took longer to bake than others. At every step on the way to the box, the time needed depended on the pizza. My daughter’s pizza required additional preparation time, so other customers were able to leave before we were. In short, pizzas were not being delivered in the same sequence as they were ordered.

Is ordered delivery a requirement?

Just as at the pizza shop, we might intuitively think that certain processes require in-order processing. But in real life, this is usually not true. There might be a lot of scenarios that don’t fit with what we expected. In real-life scenarios, the business always adapts.

For example, imagine a payment arrives for an order we haven’t received (yet). We could simply return the money, or we could wait a while for the order to show up. In another scenario, say a product is sold just after we ran out of inventory. We could automatically cancel the order. A better option is to automatically re-order stock and let the customer know it’s been back-ordered. Or offer them a coupon for a different product.

The point is: our businesses can adapt to out-of-order delivery of information, so our software should be able to as well.

Applying strict in-order processing would impose artificial limitations on our system. Guaranteeing message ordering is technically very difficult and, even when it succeeds, it comes with tradeoffs such as lower message throughput and reduced scalability that hamper the system’s ability to be successful. Consider our earlier pizza parlor and how many more orders it’s able to process by filling them out of order based on how quickly certain pizzas can be prepared, rather than solely on when the order was placed.

Let’s have a look at why it is difficult from a technical standpoint to guarantee ordered delivery.

Exceptions

What happens when programming exceptions occur in message processing code? Even with the most robust code possible, exceptions can still happen. As developers, this is not unfamiliar to us. After all, it’s why we write unit tests: to guard ourselves against the unexpected. But not everything is under our control, and a lot can go wrong.

Processing a message can throw an exception because of a transient error. Although not all errors are severe, we do need to deal with them. A lot of the time we can simply requeue the failed message to solve the issue. But we should also be able to deal with “poison” messages, those that keep failing and should be put aside to be retried at a later time.

Whether we retry a poison message within a few seconds or days later is irrelevant. The point is that the poison message has been dealt with and another message can now be taken from the queue, one that was supposed to be processed after the poison message that is being retried later. The result is that messages we expected to arrive in order are now being processed out of order.
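To make the reordering concrete, here is a minimal sketch of a queue with retries. It is a plain in-memory simulation, not NServiceBus code: the RetryDemo class, the Process method, and the "move to the back of the queue" retry strategy are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class RetryDemo
{
    // Processes messages from a FIFO queue. Any message listed in failsOnce throws
    // on its first attempt and is moved to the back of the queue (a hypothetical
    // stand-in for a delayed retry). Returns the order in which messages succeeded.
    public static List<int> Process(IEnumerable<int> messages, ISet<int> failsOnce)
    {
        var queue = new Queue<int>(messages);
        var alreadyFailed = new HashSet<int>();
        var processed = new List<int>();

        while (queue.Count > 0)
        {
            int message = queue.Dequeue();
            try
            {
                // Simulated transient error: only the first attempt fails.
                if (failsOnce.Contains(message) && alreadyFailed.Add(message))
                    throw new InvalidOperationException($"Transient failure on message {message}");

                processed.Add(message);
            }
            catch (InvalidOperationException)
            {
                // Set the message aside for a later retry; the queue keeps moving.
                queue.Enqueue(message);
            }
        }

        return processed;
    }

    public static void Main()
    {
        var order = Process(new[] { 1, 2, 3 }, new HashSet<int> { 2 });
        Console.WriteLine(string.Join(", ", order)); // prints "1, 3, 2"
    }
}
```

Message 2 was sent before message 3, but because its first attempt failed, message 3 is processed first.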

Scalability

When a system is built with messaging as one of its foundations, the ability to scale out is hardly an afterthought. Unfortunately it makes ordered delivery virtually impossible to support.

The ability to scale out is a very powerful feature. Instead of scaling up, where you buy more powerful and expensive hardware, you scale out by having more servers processing messages. Every server basically competes for messages, each doing its best to process as many as it can. Going back to our pizza place, this is similar to buying more ovens to bake more pizzas at once rather than upgrading the existing ones to process the same number of pizzas faster.

Although the servers can scale out, they have no knowledge of each other. This usually isn’t an issue, except with ordered delivery. Messages that need to arrive in order might be processed on different machines. One of those machines could have less work or finish work more quickly, resulting in messages being processed out of order. The same applies to threads: even if you don’t scale out and have only a single server, that server would have to process messages on a single thread (and thus more slowly) to keep them in order.
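The effect of competing consumers can be sketched with a small, deterministic simulation. This is an illustration under simplifying assumptions rather than real messaging code: the CompetingConsumersDemo class and the message names are hypothetical, processing time is a plain integer, and every message is available at time zero.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CompetingConsumersDemo
{
    // Hands out messages in queue order to whichever worker becomes free first,
    // then returns the message ids in the order in which processing finished.
    public static List<string> CompletionOrder((string Id, int Duration)[] messages, int workerCount)
    {
        var freeAt = new int[workerCount]; // simulated time at which each worker is idle again
        var finished = new List<(string Id, int FinishTime)>();

        foreach (var (id, duration) in messages)
        {
            // The next message goes to the worker that becomes available soonest.
            int worker = Array.IndexOf(freeAt, freeAt.Min());
            freeAt[worker] += duration;
            finished.Add((id, freeAt[worker]));
        }

        // OrderBy is a stable sort, so ties keep their original queue order.
        return finished.OrderBy(f => f.FinishTime).Select(f => f.Id).ToList();
    }

    public static void Main()
    {
        var messages = new (string Id, int Duration)[] { ("Message1", 5), ("Message2", 1) };

        Console.WriteLine(string.Join(", ", CompletionOrder(messages, 2))); // prints "Message2, Message1"
        Console.WriteLine(string.Join(", ", CompletionOrder(messages, 1))); // prints "Message1, Message2"
    }
}
```

With two workers, the quick second message finishes before the slow first one. With a single worker the order is preserved, but the two messages take six time units in total instead of five.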

Back to the real world

When things happen out of order in the physical world, things usually work themselves out through various checks and balances. I may have ordered my pizza before someone else but if their pizza is done first, people don’t stand around wondering what to do. I simply stand aside and let them take their pizza with no lasting harm done, aside from maybe a jealous stare at them for getting their pizza first.

Sometimes I’ll order a pizza for carry-out on my way home. When I place the order, the shop will tell me roughly how long it’ll take. If I get there early, I’ll pay for it and wait until it’s done. But maybe I get delayed in traffic or wait until the end of that Game of Thrones episode I’m watching for the tenth time. By the time I get there, my pizza is waiting for me so I pay for it and take it with me.

The important part of this scenario is that two events—the pizza being ready and the pizza being paid for—might happen out of order. But both need to be completed before the pizza can be delivered. In system modeling terms, we might have a DeliveryService that depends on an OrderPaid message from a PaymentService and a PizzaPrepared message from a KitchenService. If it gets the OrderPaid message first, it can’t deliver yet because it doesn’t know if the order has been prepared yet. In this case, you can imagine the customer constantly pinging the DeliveryService (i.e. the cashier) at regular intervals to see if it’s finished yet.

Modeling out-of-order messages in software

Instead we can use an NServiceBus feature called sagas. These are message-driven state machines that automatically store state, deal with concurrency, and help us orchestrate long-running business processes.

Let’s have a look at how a saga in NServiceBus deals with messages arriving out of order. With a little state it can remember what already happened and act based on those constraints. For simplicity of the code we’ll use two flags.

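    // Simplified for readability: saga data, message correlation, and the
    // handler context are left out.
    class DeliveryPolicy : Saga
    {
        public void Handle(OrderPaid message)
        {
            Data.OrderPaid = true;
            VerifyIfPizzaCanBeDelivered();
        }

        public void Handle(PizzaPrepared message)
        {
            Data.PizzaPrepared = true;
            VerifyIfPizzaCanBeDelivered();
        }

        // Runs after every message; only acts once both flags are set.
        private void VerifyIfPizzaCanBeDelivered()
        {
            if (Data.OrderPaid && Data.PizzaPrepared)
            {
                // ... send message that pizza can be delivered
            }
        }
    }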

In this example, when either message arrives, the state of the saga is altered. It then checks this state to see if it should continue with delivery. Let’s assume the PizzaPrepared message arrives first. The saga will mark the order as having been prepared and then check to see if all the conditions have been met. They haven’t, so the saga goes back into a holding pattern until the OrderPaid message arrives. At this point, VerifyIfPizzaCanBeDelivered determines that all conditions have been met and we can continue with the order.

But what if the OrderPaid message arrives first? Perhaps the KitchenService is backed up with orders and hasn’t finished the pizza yet. In this case, the saga does virtually the same thing. It marks the order as paid and then checks its internal state to see if all conditions have been met to continue. They haven’t, so again the order sits until a PizzaPrepared event arrives and completes the requirements to deliver the pizza.
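Both arrival orders can be tried out with a small in-memory stand-in for the saga. To be clear, this sketch does not use the NServiceBus API: the DeliveryPolicyStandIn class, the OrderId field on the messages, and the dictionary that plays the role of persisted saga state are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public class OrderPaid { public int OrderId; }      // OrderId is an assumed correlation field
public class PizzaPrepared { public int OrderId; }

public class DeliveryPolicyStandIn
{
    private class State { public bool OrderPaid; public bool PizzaPrepared; }

    // One state record per order, standing in for the state a real saga would persist.
    private readonly Dictionary<int, State> states = new Dictionary<int, State>();

    // Orders released for delivery, in the order they became ready.
    public List<int> Delivered { get; } = new List<int>();

    public void Handle(OrderPaid message)
    {
        var state = GetState(message.OrderId);
        state.OrderPaid = true;
        VerifyIfPizzaCanBeDelivered(message.OrderId, state);
    }

    public void Handle(PizzaPrepared message)
    {
        var state = GetState(message.OrderId);
        state.PizzaPrepared = true;
        VerifyIfPizzaCanBeDelivered(message.OrderId, state);
    }

    private State GetState(int orderId)
    {
        if (!states.TryGetValue(orderId, out var state))
        {
            state = new State();
            states[orderId] = state;
        }
        return state;
    }

    private void VerifyIfPizzaCanBeDelivered(int orderId, State state)
    {
        if (state.OrderPaid && state.PizzaPrepared)
        {
            Delivered.Add(orderId);
            states.Remove(orderId); // this order's process is finished
        }
    }
}

public static class SagaDemo
{
    public static void Main()
    {
        var policy = new DeliveryPolicyStandIn();
        policy.Handle(new PizzaPrepared { OrderId = 1 }); // order 1: prepared before paid
        policy.Handle(new OrderPaid { OrderId = 2 });     // order 2: paid before prepared
        policy.Handle(new OrderPaid { OrderId = 1 });     // order 1 is now complete
        policy.Handle(new PizzaPrepared { OrderId = 2 }); // order 2 is now complete
        Console.WriteLine(string.Join(", ", policy.Delivered)); // prints "1, 2"
    }
}
```

Whichever message arrives first for a given order, the pizza is released exactly once, and only after both messages have been handled.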

Sagas provide a tool to solve all these ordering issues, as well as taking care of all the technical considerations like scale-out and optimistic concurrency, allowing you to focus on the business requirements instead.

Summary

From a technical perspective, it is nearly impossible to combine ordered delivery, error handling, and scalability in one system. At the same time, it is very unlikely that a business process actually requires ordered delivery. From both the business and the technical perspective, we need to be able to adapt to different scenarios. What we actually need is a way to deal with those alternative scenarios.

But messages will arrive out of order and they should be allowed to arrive out of order. Instead of trying to “fix” this, we can embrace it and instead ask questions and offer alternative flows. We shouldn’t accept a technical constraint that would force a customer to wait, while their pizza gets cold, just because some other customer ordered first.

If you’re ready to start dealing with out-of-order delivery of messages, be sure to check out our saga tutorial.

About the author: Dennis van der Stelt is an engineer at Particular Software and is very strict about making sure everything is in the order correct.