Human beings tend to filter out events they deem unimportant. They can only process so much at any given time.

Computer systems, however, must be able to handle a massive number of events in real time or near-real time to help support a wide range of applications. Financial applications must monitor events to help counter fraud. Retail apps use online shopping events to capitalize on growth opportunities. Manufacturing production lines and energy systems use event data from Internet of Things (IoT) devices for automating faster, smarter responses to changes that demand attention. These are just a few examples where event processing can potentially be applied with great effect.

A different kind of data store

With the high velocity, volume and variety of data that events can generate, an event data store must be able to deliver:

Fast data ingest rates (inserts)

In-memory indexing for fast and efficient lookups

Near real-time analytics on all ingested data with online analytical processing (OLAP)

Integrated machine learning capabilities to “learn” from previous events

High availability and replication to provide continuous value to the business

Linear scalability by simply adding nodes

Support for open storage such as Apache Parquet to minimize vendor lock-in

Support for hybrid cloud configurations to match with appropriate workloads and service-level agreements
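The first two requirements — fast inserts and in-memory indexing for fast lookups — can be made concrete with a minimal, illustrative sketch. All names here are hypothetical; a production event store implements these structures in heavily optimized native code:

```python
from collections import defaultdict

class EventIndex:
    """Toy in-memory index: maps an attribute value to event offsets
    so lookups probe the index instead of scanning the whole log.
    Illustrative sketch only; names are hypothetical."""

    def __init__(self, key_field):
        self.key_field = key_field
        self.events = []                 # append-only event log
        self.index = defaultdict(list)   # key value -> list of offsets

    def ingest(self, event):
        offset = len(self.events)
        self.events.append(event)        # O(1) append for fast inserts
        self.index[event[self.key_field]].append(offset)
        return offset

    def lookup(self, key_value):
        # O(1) index probe, then fetch only the matching events
        return [self.events[i] for i in self.index.get(key_value, [])]

store = EventIndex(key_field="device_id")
store.ingest({"device_id": "sensor-7", "temp": 21.5})
store.ingest({"device_id": "sensor-9", "temp": 30.1})
store.ingest({"device_id": "sensor-7", "temp": 22.0})
print(len(store.lookup("sensor-7")))  # 2
```

The same pattern scales down conceptually: an append-only write path keeps inserts cheap, while the index pays a small cost per insert to keep reads fast.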

Three key characteristics are necessary to deliver an effective event processing and data store solution:

An event store architecture.

1. Ingest

Car sensors, home appliances, credit card purchases, mobile transactions and aircraft flight systems generate volumes of events, many at high velocity. This necessitates the efficient ingestion of vast amounts of data, on the order of a million inserts per second [1], through numerous streaming systems such as Apache Kafka, Apache Spark, IBM Streams and other vendors’ streaming solutions. Support for industry standards and open APIs in languages such as Scala and Python is necessary to make use of existing skill sets and to democratize event streaming and processing.
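A common technique for sustaining insert rates of this magnitude is micro-batching: events arriving from a messaging layer are buffered briefly and written in bulk, amortizing per-write overhead. The sketch below is illustrative only (the flush target is a plain list standing in for the store's write path):

```python
class MicroBatchIngester:
    """Illustrative sketch: buffer incoming events and flush them in
    micro-batches to sustain high insert rates. Hypothetical names."""

    def __init__(self, store, batch_size=10_000):
        self.store = store
        self.batch_size = batch_size
        self.buffer = []
        self.flushed_batches = 0

    def ingest(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.store.extend(self.buffer)   # one bulk write per batch
            self.flushed_batches += 1
            self.buffer = []

store = []
ingester = MicroBatchIngester(store, batch_size=1000)
for i in range(2500):                        # simulate a burst of events
    ingester.ingest({"id": i})
ingester.flush()                             # drain the remainder
print(len(store), ingester.flushed_batches)  # 2500 3
```

Real connectors for Kafka or Spark apply the same idea with tuned batch sizes and back-pressure handling.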

2. Analytics

To access the latest ungroomed data, queries must be able to directly access the optimized event nodes and their associated cached data in the cluster. If minutes-old data is acceptable, users should be able to query the hardened, stored data using “vanilla” Spark nodes and the compatible analytics tools of their choice.

Ultimately, queries should be able to retrieve the most recent data and combine it with groomed data in cache or in the storage layer.
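Conceptually, such a query scans the freshest cached events first, then the groomed storage layer, letting the newest copy of an event shadow any older groomed copy. A simplified, illustrative sketch (all names hypothetical):

```python
def query(cached_recent, groomed_storage, predicate):
    """Answer a query over the union of not-yet-groomed cached events
    and groomed, hardened data, preferring the newest version of any
    duplicated event. Illustrative sketch only."""
    seen = set()
    results = []
    # Scan recent data first so it shadows older groomed copies.
    for event in list(cached_recent) + list(groomed_storage):
        if event["id"] in seen:
            continue
        seen.add(event["id"])
        if predicate(event):
            results.append(event)
    return results

cached = [{"id": 3, "amount": 90}, {"id": 4, "amount": 250}]
groomed = [{"id": 1, "amount": 40}, {"id": 2, "amount": 300},
           {"id": 3, "amount": 85}]  # id 3 superseded by the cache
large = query(cached, groomed, lambda e: e["amount"] > 100)
print(sorted(e["id"] for e in large))  # [2, 4]
```

A production engine pushes the predicate down to each layer rather than materializing both, but the shadowing rule is the same.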

3. Availability

High availability for all data, event store processes and stores is vital. Technology often fails at the moment users can least afford it. Data, along with all associated log data, must be shared through replication across nodes for redundancy. Should a node failure occur, queries must continue to be processed, so the configured number of query replicas must always remain reachable, as depicted below.

High availability in an IBM Db2 Event Store cluster.

Any event stream processing solution should provide sophisticated management and monitoring capabilities to help provide insight into the health of the system.
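The replication and failover behavior described above can be sketched in a few lines: each shard is placed on multiple nodes, and a read simply skips replicas that are known to be down. This is an illustrative toy, not Db2 Event Store's actual placement scheme:

```python
import hashlib

class ReplicatedCluster:
    """Illustrative sketch of shard replication for availability: each
    key lives on `replicas` nodes, so reads survive a node failure."""

    def __init__(self, nodes, replicas=2):
        self.nodes = {n: {} for n in nodes}   # node -> shard data
        self.node_list = list(nodes)
        self.replicas = replicas
        self.failed = set()

    def _owners(self, key):
        # Deterministic placement: hash picks the first owner; the
        # remaining replicas go to the next nodes in the ring.
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        start = h % len(self.node_list)
        return [self.node_list[(start + i) % len(self.node_list)]
                for i in range(self.replicas)]

    def write(self, key, value):
        for node in self._owners(key):
            self.nodes[node][key] = value     # synchronous replication

    def read(self, key):
        for node in self._owners(key):
            if node not in self.failed:       # skip failed replicas
                return self.nodes[node][key]
        raise RuntimeError("all replicas unreachable")

cluster = ReplicatedCluster(["n1", "n2", "n3"], replicas=2)
cluster.write("event-42", {"status": "ok"})
primary = cluster._owners("event-42")[0]
cluster.failed.add(primary)                   # simulate a node failure
print(cluster.read("event-42"))               # {'status': 'ok'}
```

The key design point is that replica count is configurable per the availability guarantee required, exactly as the article describes.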

Event-driven AI with machine learning

Consider the impact of AI, machine learning in particular, on event processing.

Machine learning is best facilitated through the availability of large quantities of data. A high-speed event store can capture, analyze and store more than 250 billion events per day. This can enable applying machine learning to the most recent data along with the historical data.

Each time an event occurs, a system could “learn,” process and react rapidly, subject to its processing capabilities. Event processing coupled with machine learning helps applications become “aware” of what is happening, to the point where they could potentially predict when similar events might recur. By correlating previous events and outcomes, such applications can help protect against impending fraud or disaster.
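A small, self-contained example of this kind of event-driven “learning” is an online anomaly detector: it updates running statistics with every event (Welford's algorithm) and flags values far outside what it has seen so far, much as a fraud monitor flags an unusual transaction. This is a generic sketch, not a specific product feature:

```python
import math

class RunningAnomalyDetector:
    """Maintain a running mean/variance over an event field and flag
    events far from the learned range. Illustrative sketch only."""

    def __init__(self, threshold=3.0):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0                   # sum of squared deviations
        self.threshold = threshold

    def observe(self, value):
        # Score the new value against history, then learn from it.
        anomalous = self.is_anomalous(value)
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n     # Welford's online update
        self.m2 += delta * (value - self.mean)
        return anomalous

    def is_anomalous(self, value):
        if self.n < 10:                 # not enough history yet
            return False
        std = math.sqrt(self.m2 / (self.n - 1))
        return std > 0 and abs(value - self.mean) > self.threshold * std

detector = RunningAnomalyDetector()
flags = [detector.observe(v) for v in [100, 102, 99, 101, 100, 98,
                                       103, 100, 99, 101, 500]]
print(flags[-1])  # True: 500 is far outside the learned range
```

Real deployments would train richer models over the historical store, but the pattern — score each event as it arrives, then fold it into the model — is the same.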

There are many scenarios where this linkage between event processing and machine learning can be applied, reacting faster and more often than human beings could ever manage. Having intelligent, “aware” event-driven systems augmenting one’s own capabilities is like having a personal, trusted adviser or assistant.