In pub/sub systems, publishers send messages to an intermediate message broker or event bus, and subscribers register subscriptions with that broker, which fans messages out to them. Every subscriber to a topic receives its own copy of each message published: a single message produced by one publisher may be distributed to hundreds, thousands, or even millions of subscribers.

Publishers can produce events on multiple topics, and consumers subscribe to the topics they are interested in. The many solutions available in the market for implementing an event-data pipeline all share the same basic design principle:

Publishers and subscribers are decoupled.
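As a minimal sketch of that principle (plain Python, not tied to any particular broker; the Broker class, method names, and topics are made up for illustration), the broker below is the only component publishers and subscribers know about, and it fans each message out to every subscriber of a topic:

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

class Broker:
    """Minimal in-process broker: publishers and subscribers only know the
    broker and a topic name, never each other."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        # Register a subscription; the broker tracks who wants what.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        # Fan-out: every subscriber of the topic gets its own copy.
        for handler in self._subscribers[topic]:
            handler(message)

broker = Broker()
broker.subscribe("orders", lambda m: print("billing got:", m))
broker.subscribe("orders", lambda m: print("shipping got:", m))
broker.publish("orders", "order-42 created")  # both subscribers receive it
```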

Advanced publish/subscribe principles

Some more advanced design principles, such as:

durable subscriptions

message delivery quality of service (at-most-once, exactly-once, at-least-once; see the sketch after this list)

high availability

fault tolerance

will depend on the chosen solution and should be part of the considerations when deciding on the event-data pipeline backbone.
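For the delivery-semantics point in particular, the toy sketch below (plain Python, not modelled on any specific broker's API) shows where the difference between at-most-once and at-least-once comes from: whether a message stops being the broker's responsibility before or after it is actually processed. Exactly-once typically requires deduplication or transactional processing layered on top of at-least-once.

```python
import queue
from typing import Callable

# Toy model of delivery semantics; the in-process queue stands in for the
# broker, and taking a message off the queue plays the role of an "ack".

def consume_at_most_once(q: "queue.Queue[str]", handler: Callable[[str], None]) -> None:
    msg = q.get()    # the message is acked *before* processing,
    handler(msg)     # so a crash inside handler() silently loses it

def consume_at_least_once(q: "queue.Queue[str]", handler: Callable[[str], None]) -> None:
    msg = q.get()
    try:
        handler(msg)  # process first ...
    except Exception:
        q.put(msg)    # ... and put the message back on failure, so it will be
        raise         # redelivered, possibly more than once (hence "at least")
```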

Some of the available tools that will enable you to incorporate these design principles are Apache Kafka, AWS Kinesis, RabbitMQ, AWS SNS, Azure Service Bus, Azure Event Hubs, Google Pub/Sub, and Redis (with Pub/Sub, Streams, Lists, or even Sorted Sets).

Apart from the design principles each tool supports, you should also consider the overall volume of data your pipeline will carry (especially if you intend to make use of durable subscriptions) and the message rate to expect at any given moment. The event-data pipeline needs to handle a large volume of event data, produced by many data sources, in a reliable and efficient manner.

A quick note on Publish/Subscribe or Event Streams for Event Data Pipelines

While in Pub/Sub messages are normally fire-and-forget and are never stored anywhere, event streams work in a fundamentally different way. Stream abstractions are useful (from the developer's perspective) for building responsive distributed systems that support fault tolerance and scalability. You should ask yourself the following questions when deciding between Publish/Subscribe and Event Streams:

Does your application need to be able to retrieve historical events?

Does your application need to scale out receives, meaning adding groups of clients that cooperate by each consuming a different portion of the same data stream? (See the sketch after this list.)

Does your application require fine-grained subscriptions, meaning selecting events at a finer granularity than the topic-based approach?

Does your application demand a different quality of service depending on the topic from which you wish to receive events?
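As a rough sketch of what scaling out receives can look like in practice, the snippet below uses Redis Streams consumer groups through the redis-py client; it assumes a Redis instance running locally, and the stream, group, and consumer names are made up for illustration. Each pending entry is delivered to only one consumer in the group, so adding consumers spreads the work.

```python
import redis

r = redis.Redis()  # assumes a local Redis instance and the redis-py client

# Create a consumer group on the stream (ignore the error if it already exists).
try:
    r.xgroup_create("events-stream", "billing", id="0", mkstream=True)
except redis.ResponseError:
    pass

# Two consumers in the same group: each entry is handed to only one of them,
# so adding consumers to the group spreads the load.
for consumer in ("worker-1", "worker-2"):
    for stream_name, messages in r.xreadgroup("billing", consumer, {"events-stream": ">"}, count=10):
        for message_id, fields in messages:
            print(consumer, "processing", message_id, fields)
            r.xack("events-stream", "billing", message_id)  # ack only after processing
```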

Most of the tools available in the market can act as both a stream-processing platform and a messaging/pub-sub system. Working through the questions above, together with any more advanced features you might require, should allow you to determine whether a messaging Publish/Subscribe or an Event Streams solution is more appropriate.
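To make the difference between the two models concrete before looking at Redis in detail, here is a minimal sketch using the redis-py client (again assuming a local Redis instance; the channel and stream names are illustrative): a published message is gone if nobody is subscribed at that moment, while a stream entry is retained and can be read, and re-read, later.

```python
import redis

r = redis.Redis()

# Pub/Sub: fire and forget. Only subscribers connected at publish time see
# the message; nothing is stored for later.
r.publish("events", "user-signed-up")

# Stream: the event is appended to a log and kept, so consumers can read it
# later (and re-read it) from any position in the stream.
r.xadd("events-stream", {"type": "user-signed-up"})
print(r.xrange("events-stream", min="-", max="+"))  # historical events are still there
```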

Pub/Sub in action with Redis