- Realtime Streaming Analytics: static queries are given once and do not change; they process data as it arrives, without storing it. CEP, Apache Storm, Apache Samza, etc. are examples of this.
- Realtime Interactive/Ad-hoc Analytics: users issue ad-hoc, dynamic queries and the system responds. Druid, SAP Hana, VoltDB, MemSQL, and Apache Drill are examples of this.

- Simple counting (e.g. failure count)
- Counting with windows (e.g. failure count every hour)
- Preprocessing: filtering, transformations (e.g. data cleanup)
- Alerts, thresholds (e.g. alarm on high temperature)
- Data correlation, detecting missing events, detecting erroneous data (e.g. detecting failed sensors)
- Joining event streams (e.g. detect a hit on a soccer ball)
- Merging with data in a database; collecting and updating data conditionally
- Detecting event sequence patterns (e.g. small transaction followed by large transaction)
- Tracking: following some related entity's state in space, time, etc. (e.g. location of airline baggage, vehicles, tracking wildlife)
- Detecting trends: rise, turn, fall, outliers, and complex trends like triple bottom (e.g. algorithmic trading, SLA, load balancing)
- Learning a model (e.g. predictive maintenance)
- Predicting the next value and taking corrective actions (e.g. automated car)
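To make two of these patterns concrete, here is a minimal Python sketch of "counting with windows" combined with a threshold alert, written by hand rather than in a CEP language. The names `SlidingWindowCounter` and `failure_alerts`, the `(timestamp, status)` event shape, and the default window and threshold are illustrative assumptions, not taken from any particular engine. It also assumes events arrive in timestamp order, which real streams often do not guarantee.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen in the last `window_seconds` seconds (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def add(self, timestamp):
        """Record an event and return the number of events in the current window."""
        self.timestamps.append(timestamp)
        # Evict events that fell out of the window ending at `timestamp`.
        cutoff = timestamp - self.window_seconds
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps)


def failure_alerts(events, window_seconds=3600, threshold=3):
    """Yield (timestamp, count) whenever failures in the window reach the threshold.

    `events` is an iterable of (timestamp, status) pairs, assumed to be
    sorted by timestamp.
    """
    counter = SlidingWindowCounter(window_seconds)
    for timestamp, status in events:
        if status != "failure":
            continue  # filtering step: only failures are counted
        count = counter.add(timestamp)
        if count >= threshold:
            yield timestamp, count


if __name__ == "__main__":
    sample = [(0, "failure"), (100, "ok"), (1000, "failure"),
              (2000, "failure"), (4000, "failure"), (9000, "failure")]
    for alert in failure_alerts(sample):
        print(alert)
```

Even this small case has to settle eviction, boundary conditions, and event ordering by hand, which is the kind of work a windowed query in a SQL-like language would leave to the runtime.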

- Realtime analytics are hard. Not every developer wants to hand-implement sliding windows, temporal event patterns, etc.
- SQL-like languages are easy to follow and learn for people who know SQL, which is pretty much everybody.
- SQL-like languages are expressive, short, sweet, and fast!
- SQL-like languages define core operations that cover 90% of problems.
- Experts can dig in when they like!
- A realtime analytics runtime can better optimize execution with a SQL-like model. Most optimisations are already studied, and there is a lot you can simply borrow from database optimisations.
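As a rough illustration of what hand-implementing a temporal event pattern involves, the Python sketch below detects a small transaction followed by a large one on the same account within a time limit. The function name `small_then_large`, the `(timestamp, account, amount)` tuple shape, and the thresholds are all assumptions made for this example; it also assumes the stream is ordered by timestamp.

```python
def small_then_large(transactions, small_max=10.0, large_min=1000.0,
                     within_seconds=600):
    """Yield (account, small_ts, large_ts) for each small-then-large sequence.

    `transactions` is an iterable of (timestamp, account, amount) tuples,
    assumed to be sorted by timestamp. Amounts between the two thresholds
    are ignored and do not break a pending sequence.
    """
    last_small = {}  # account -> timestamp of its most recent small transaction
    for timestamp, account, amount in transactions:
        if amount <= small_max:
            last_small[account] = timestamp
        elif amount >= large_min:
            # A large transaction consumes the pending small one, if any.
            small_ts = last_small.pop(account, None)
            if small_ts is not None and timestamp - small_ts <= within_seconds:
                yield account, small_ts, timestamp


if __name__ == "__main__":
    stream = [(0, "a", 5.0), (60, "b", 2000.0), (120, "a", 1500.0),
              (200, "a", 3.0), (2000, "a", 5000.0)]
    for match in small_then_large(stream):
        print(match)
```

The sketch leaves open questions that a real implementation has to answer, such as out-of-order events, state that grows with the number of accounts, and whether intermediate events should cancel a match. A pattern operator in a CEP query language answers those once, inside the engine.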

By Srinath Perera (@srinath_perera).

I was at Strata+Hadoop World 2015 last week, and interest in realtime analytics was certainly at its peak.

Realtime analytics, or what people call Realtime Analytics, has two flavors: Realtime Streaming Analytics and Realtime Interactive/Ad-hoc Analytics.

In this post, I am focusing on Realtime Streaming Analytics. (Ad-hoc analytics uses a SQL-like query language anyway.)

Still, when thinking about Realtime Analytics, people think only of counting use cases. However, that is the tip of the iceberg. Because of the time dimension inherent in the data of realtime use cases, there is a lot more you can do. The patterns listed above are a few common ones.

Why do we need a SQL-like query language for Realtime Streaming Analytics?

Each of the above patterns has come up in use cases, and we have implemented them using SQL-like CEP query languages. Knowing the internals of implementing core CEP concepts like sliding windows and temporal query patterns, I do not think every streaming use case developer should rewrite them. The algorithms are not trivial, and they are very hard to get right!

Instead, we need higher levels of abstraction. We should implement those once and for all, and reuse them. The best lesson we can learn is from Hive and Hadoop, which do exactly that for batch analytics. I have explained Big Data with Hive many times, and most people get it right away. Hive has become the major programming API for most Big Data use cases.

The reasons for a SQL-like query language are listed above.

Finally, what are such languages? There are a lot defined in the world of Complex Event Processing (e.g. WSO2 Siddhi, Esper, Tibco StreamBase, IBM Infosphere Streams, etc.). SQLstream has a fully ANSI SQL compliant version of it. Last week I gave a talk at Strata discussing this problem in detail and how CEP could fit the bill.

Bio: Srinath Perera is a scientist, software architect, and programmer who works on distributed systems.

Original: srinathsview.blogspot.com/2015/02/why-we-need-sql-like-query-language-for.html