Introduction

Distributed systems have come a long way from where they started and still have a long way to go. With more and more people connected through telecommunication technology over the largest distributed system on the planet, the internet, the demand for efficient distributed architectures has never been greater. This is not a tutorial on distributed systems, though. Rather, it is a collection of thoughts on building a modern distributed system that addresses today's challenges, with the focus on messaging styles.

In a distributed system, messaging plays a pivotal role. Data flows from one system to another through messages, and different protocols and formats are used to share that data as messages or events. Another key aspect of messaging is the nature of the communication. Broadly, there are two styles of messaging.

Synchronous (request-response) messaging

Asynchronous (pub-sub, competing consumer, event processing, batch processing) messaging

Synchronous messaging

In layman’s terms, synchronous messaging means that the system sending the message expects an immediate response from the target system. This is also known as the request-response style of messaging.

Figure 1: Synchronous messaging (request-response with blocking)

As depicted in the above figure, the source system sends a message (request) to the target system and waits (blocks) until it receives a response. The target system may process the message itself or forward it to another system, generating a response within a short period of time. To keep this communication reliable and to make sure the source system does not waste its resources in case of a target system failure, a timeout is configured on the source side so that it stops waiting once the timeout elapses without a response from the target system.
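The blocking pattern with a timeout can be sketched in a few lines of Python. This is a minimal illustration, not a real network call: `call_target` and its delay are stand-ins for an actual target system.

```python
import concurrent.futures
import time

def call_target(request: str) -> str:
    """Simulate the target system processing a request."""
    time.sleep(0.1)  # stand-in for processing / network delay
    return f"response-to-{request}"

# The source system blocks on the future, but gives up after a timeout
# so its resources are not held forever if the target fails.
with concurrent.futures.ThreadPoolExecutor() as pool:
    future = pool.submit(call_target, "req-1")
    try:
        response = future.result(timeout=2.0)  # block, with a timeout
    except concurrent.futures.TimeoutError:
        response = None  # target did not answer in time

print(response)  # response-to-req-1
```

Shrinking the timeout below the processing delay would make `response` come back as `None` instead, which is exactly the failure path the timeout exists to handle.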

In real-world implementations, we can observe two mechanisms used to build request-response style communication: it can be implemented in either a blocking or a non-blocking manner. The above example demonstrates the blocking style. Instead of waiting for the response, the source system can continue its work after registering a callback, so that whenever a response is available from the target, the callback is executed and the source system processes the response. This is the common implementation pattern in most request-response systems because it does not block resources (CPU) on the source side; instead, those resources keep serving other tasks.

Figure 2: Synchronous messaging (request-response non-blocking)

The callback-based synchronous messaging mentioned above is the more performant style, but it does not work for all use cases. For example, if a database transaction spans multiple subsequent requests, we should go with the blocking approach.
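The callback-based variant can be sketched as follows, again with a simulated target. The source registers `on_response` and is then free to do other work; the callback fires when the response arrives.

```python
import concurrent.futures
import time

results = []

def call_target(request: str) -> str:
    """Simulate the target system processing a request."""
    time.sleep(0.1)
    return f"response-to-{request}"

def on_response(future):
    # Executed automatically once the response is available.
    results.append(future.result())

pool = concurrent.futures.ThreadPoolExecutor()
future = pool.submit(call_target, "req-1")
future.add_done_callback(on_response)

# The source system is free to do other work here instead of blocking.
print("doing other work while waiting...")

pool.shutdown(wait=True)  # for the demo only: wait before reading results
print(results)  # ['response-to-req-1']
```

In a real system the "other work" would be serving further requests; the key point is that no thread sits idle waiting on the response.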

The latest improvement in synchronous (request-response) messaging is the full-duplex style, where both the source and target systems can communicate over the same connection at the same time. In both the blocking and callback-based architectures mentioned above, only one system can transmit over the connection at a given time, which is called half-duplex communication. That style is not enough for modern web applications with high data demands. Protocols like WebSockets and HTTP/2 are designed for full-duplex communication.

Figure 3: Synchronous messaging (full-duplex)

This messaging style improved the performance of web applications through techniques such as

Request multiplexing

Header compression

Server Push

Request prioritization
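The full-duplex idea can be simulated without any protocol machinery: two endpoints send and receive concurrently over the same "connection". In this sketch, a pair of asyncio queues stands in for the two directions of a single connection; all names are illustrative, not a WebSocket or HTTP/2 implementation.

```python
import asyncio

async def endpoint(name, inbox, outbox, to_send):
    """Send and receive at the same time over the same 'connection'."""
    received = []

    async def sender():
        for msg in to_send:
            await outbox.put(f"{name}:{msg}")
        await outbox.put(None)  # end-of-stream marker

    async def receiver():
        while (msg := await inbox.get()) is not None:
            received.append(msg)

    # Both directions run concurrently: full-duplex.
    await asyncio.gather(sender(), receiver())
    return received

async def main():
    a_to_b, b_to_a = asyncio.Queue(), asyncio.Queue()
    return await asyncio.gather(
        endpoint("A", b_to_a, a_to_b, ["hello", "ping"]),
        endpoint("B", a_to_b, b_to_a, ["hi", "pong"]),
    )

got_a, got_b = asyncio.run(main())
print(got_a, got_b)  # ['B:hi', 'B:pong'] ['A:hello', 'A:ping']
```

In a half-duplex design, one of the two `gather` branches would have to wait for the other to finish before using the connection.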

Asynchronous messaging

Sometimes data flows through systems at rates so high that responding synchronously is impossible. Moreover, these message sources do not expect an immediate response, or sometimes do not expect a response at all. The typical expectations for these types of messages are

Guaranteed delivery

Extensive processing

Correlation and time series analysis

Decoupling of source and target systems

There are different forms of asynchronous messaging architectures implemented for different use cases. Some of the commonly used techniques are mentioned below.

Topic based publish-subscribe

Queue based publish-subscribe

Event based real-time processing

Batch processing

Store and Forward

Let’s see how these models work in practice.

Topic based publish-subscribe

In this model, message producers and consumers are decoupled through an intermediary messaging infrastructure. The intermediary component is called a topic, and the messaging infrastructure hosting these topics implements many additional features such as partitions, replication, and durability. Apache Kafka is a prime example of a topic-based publish-subscribe messaging infrastructure. In this model, more than one consumer can subscribe to a topic, and every subscribed consumer receives at least one copy of each message the producer publishes to the topic. Some implementations remove messages from topics once they are consumed by subscribers, while systems like Kafka retain messages for a configurable period of time so that subscribers can consume a message multiple times if necessary.

Figure 4: Topic based asynchronous messaging
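A toy version of the retained-log behaviour can be sketched as below. This is an in-memory analogy for how a Kafka-style topic works (the class and method names are made up for illustration): the topic keeps a log and each subscriber tracks its own read offset, so every subscriber sees every message.

```python
from collections import defaultdict

class Topic:
    """In-memory topic: retains messages so every subscriber gets a copy."""
    def __init__(self):
        self.log = []                     # retained messages (like a Kafka log)
        self.offsets = defaultdict(int)   # per-subscriber read position

    def publish(self, message):
        self.log.append(message)

    def poll(self, subscriber):
        """Return all messages this subscriber has not yet consumed."""
        start = self.offsets[subscriber]
        self.offsets[subscriber] = len(self.log)
        return self.log[start:]

topic = Topic()
topic.publish("order-created")
topic.publish("order-paid")

# Every subscriber gets its own copy of each message.
print(topic.poll("billing"))   # ['order-created', 'order-paid']
print(topic.poll("shipping"))  # ['order-created', 'order-paid']

topic.publish("order-shipped")
print(topic.poll("billing"))   # ['order-shipped'] -- only what is new to billing
```

Because consumption only moves a subscriber's offset rather than deleting anything, a new subscriber added later could still replay the whole log, which is the retention property the text describes.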

Queue based publish-subscribe

In the above model, every consumer gets the same message. In the queue-based model, by contrast, only one consumer receives a given message. This is also called the competing consumer pattern, because consumers compete to receive messages from the queue. The pattern is useful for sharing message-processing load across multiple consumers. Traditional message brokers like RabbitMQ and ActiveMQ implement this form of asynchronous messaging.

Figure 5: Queue based asynchronous messaging
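The competing consumer pattern falls out naturally from a shared thread-safe queue. In this minimal sketch, two consumer threads pull from one queue; each message is delivered to exactly one of them, and the load is shared between the two.

```python
import queue
import threading

work = queue.Queue()
processed = {"c1": [], "c2": []}

def consumer(name):
    # Consumers compete: each message goes to exactly one of them.
    while True:
        msg = work.get()
        if msg is None:        # shutdown marker
            break
        processed[name].append(msg)

threads = [threading.Thread(target=consumer, args=(n,)) for n in ("c1", "c2")]
for t in threads:
    t.start()

for i in range(10):
    work.put(f"msg-{i}")
for _ in threads:
    work.put(None)             # one shutdown marker per consumer
for t in threads:
    t.join()

# All 10 messages were processed, none duplicated across consumers.
total = processed["c1"] + processed["c2"]
print(len(total), len(set(total)))  # 10 10
```

How the 10 messages split between `c1` and `c2` depends on thread scheduling; only the total is deterministic, which is exactly the load-sharing property of the pattern.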

Event based real-time processing

In this model, message producers do not expect a response, but the events need to be processed in real time. This messaging pattern is useful when immediate decisions must be made based on the event data, as in fraud detection. Many stream processing platforms support this form of asynchronous messaging; some examples are WSO2 Stream Processor, Kafka Streams, Apache Flink, and Apache Storm. Most of these event processing systems also support integration with artificial intelligence (AI) and machine learning (ML).

Figure 6: Event based real time asynchronous messaging
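To make the fraud detection example concrete, here is a minimal sliding-window detector. The rule (flag a card that makes more than three transactions within sixty seconds) and all names are invented for illustration; real stream processors express this kind of windowed logic declaratively.

```python
from collections import deque

class FraudDetector:
    """Flag a card making more than `limit` transactions within `window` seconds."""
    def __init__(self, limit=3, window=60):
        self.limit = limit
        self.window = window
        self.recent = {}  # card -> deque of recent event timestamps

    def on_event(self, card, timestamp):
        events = self.recent.setdefault(card, deque())
        events.append(timestamp)
        # Drop events that fell out of the sliding window.
        while events and timestamp - events[0] > self.window:
            events.popleft()
        return len(events) > self.limit  # True => suspicious

detector = FraudDetector(limit=3, window=60)
stream = [("card-1", t) for t in (0, 10, 20, 30)] + [("card-2", 40)]
flags = [detector.on_event(card, ts) for card, ts in stream]
print(flags)  # [False, False, False, True, False]
```

The decision is made per event as it arrives, with no response sent back to the producer, which is what distinguishes this from the request-response styles earlier.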

Batch processing

Another form of asynchronous messaging is batch processing, which is one of the oldest forms of messaging. In this model, messages are received and stored in a system until they are processed. The raw data is later processed by batch jobs that run periodically at scheduled times. One key advantage of this model is that event processing can be done at the system's convenience, independent of when the events were received. Batch processing systems should be implemented in such a manner that processing time does not grow as the volume of data increases; incremental processing is a technique used to achieve this capability. Another key feature is the ability to automatically purge older data so that storage space is not wasted on redundant data that has already been processed.

Figure 7: Batch processing asynchronous messaging
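The two properties called out above, incremental processing and purging, can be sketched together. This toy store remembers an offset so each batch run only touches messages received since the last run, and processed data can then be purged (the class and its aggregation are illustrative assumptions):

```python
class BatchStore:
    """Store raw messages; a periodic batch job processes only new ones."""
    def __init__(self):
        self.messages = []
        self.processed_upto = 0  # incremental processing: remember the offset

    def receive(self, message):
        self.messages.append(message)

    def run_batch(self):
        # Process only messages received since the last batch run,
        # so processing time tracks new data, not total data.
        batch = self.messages[self.processed_upto:]
        self.processed_upto = len(self.messages)
        return sum(batch)  # example aggregation

    def purge_processed(self):
        # Reclaim storage held by already-processed data.
        self.messages = self.messages[self.processed_upto:]
        self.processed_upto = 0

store = BatchStore()
for amount in (10, 20, 30):
    store.receive(amount)
print(store.run_batch())  # 60
store.receive(5)
print(store.run_batch())  # 5  -- only the new message is processed
store.purge_processed()
print(store.messages)     # []
```

Without the offset, the second batch run would reprocess all four messages; without the purge, storage would grow without bound.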

Store and Forward

Sometimes two systems process messages at different rates. Suppose system 1 sends messages at a rate of 100 requests per second, while the receiving system can only handle 50 requests per second. If we connect these two systems directly, messages will clearly be lost. In this type of situation, we can use the store-and-forward style of messaging: system 1 publishes messages to a message store at 100 rps, and system 2 receives messages from the store at 50 rps. The intermediate message store makes sure every message is persisted until it is processed by system 2. This style of messaging is supported by standard integration products like WSO2 Enterprise Integrator.

Figure 8: Store-forward asynchronous messaging
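A back-of-the-envelope simulation of the 100 rps / 50 rps scenario shows why the store absorbs the rate mismatch. The class below is a made-up in-memory stand-in for a durable message store:

```python
import collections

class MessageStore:
    """Hold messages until the slower consumer takes them."""
    def __init__(self):
        self.pending = collections.deque()

    def store(self, message):       # fast producer side (e.g. 100 rps)
        self.pending.append(message)

    def forward(self, batch_size):  # slow consumer side (e.g. 50 rps)
        batch = []
        while self.pending and len(batch) < batch_size:
            batch.append(self.pending.popleft())
        return batch

store = MessageStore()
# One simulated second: the producer writes 100 messages, the consumer drains 50.
for i in range(100):
    store.store(f"msg-{i}")
delivered = store.forward(50)
print(len(delivered), len(store.pending))  # 50 50 -- nothing is lost
```

The 50 undelivered messages remain in the store for later seconds rather than being dropped; a real deployment would persist them to disk and expect the backlog to drain whenever the producer slows down.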

All of the above messaging styles (and many others) exist today and are implemented across many distributed systems. Some systems support a mixture of these styles to combine the best of each.

Building a modern distributed system with messaging

Enterprises are growing their customer bases across the globe thanks to the internet, the world's largest distributed system. Due to the complexity of business operations, enterprise IT infrastructure contains many different systems catering to all sorts of requirements. Communication from external customers into your enterprise ecosystem, as well as communication within the enterprise perimeter, uses one of the messaging styles described in the previous section. Selecting the proper messaging style is critical to designing a successful platform, which we call a digital platform.

When selecting a messaging style for a given communication link, as well as for the overall system, we can consider the following factors (there can be more depending on the use case)

Expectation of the message (data) source (or producer)

Amount of processing that needs to be done

Importance of data

The rate of message exchange

Let’s take a popular example and try to come up with an optimal architecture using appropriate messaging styles, so that we can use it as a reference when designing distributed systems for different domains. Consider an online trading company that wants to build a digital platform to sell items through various channels such as mobile, web, and partner stores.

Let’s consider the factors above against this online trading use case.

Expectation of the source

In this case, the source is the consumer buying a product through a web or mobile application. The consumer expects to get the relevant information, nicely formatted, in the shortest possible time. A search for an item should return results in under a second, and the application interface should be designed to render within that time regardless of the number of search results. This calls for a request-response style of messaging, since the consumer is waiting for an immediate response. In the meantime, modern rich web applications may fill various sections of the page with recommendations, offers, and discounts alongside the main content. The best fit for this type of requirement is the full-duplex messaging style, using WebSockets or HTTP/2.

Amount of processing that needs to be done

For most user operations, results are fetched from a relatively static data source such as a backend database, typically exposed to the web application through an API layer. The operation the user performs determines the amount of processing, but for these operations it is not very complex. Activities that require heavier processing at higher rates, such as tracking user activity in real time to provide live recommendations, need more processing power. Additionally, requirements like fraud detection call for an event processing system.

Importance of data

If a user is browsing for an item, losing one message is not a big problem. But if the user is buying an item and that message is lost, the impact on the business is huge. When messages need to be processed reliably, they should first be stored in a system and then processed.

The rate of message exchange

If this online trading business becomes highly popular and millions of people start using the website, a huge amount of traffic will come into the system. This is where most organizations struggle. In reality, perhaps only 10% of visitors actually buy a product; the remaining 90% just browse the website without buying anything. At the initial stages of an interaction, there is no way to identify whether a user is going to buy or just browse (AI/ML might help for returning customers, but with low precision). It is therefore evident that system designers should cater to 100% of the traffic with the same service level.

Architecture

Once the requirements are considered and the related messaging styles are decided, putting them into an architecture diagram gives a proper understanding of the overall distributed system you are developing. Interoperability between these different messaging styles is required, and some purpose-built software is needed to fill the gaps. The figure below shows a possible architecture, focusing on the messaging aspect, for the online trading use case.

Business Architecture