First check out the result of our analysis on Oscars 2015 on Twitter

The 2015 Oscars had just finished, and we were interested in learning what was hot and what flopped at the Oscars. We decided to look at Twitter to learn what was going on, and as technologists, we picked only the best technology available today for Data Analysis, Google Cloud Dataflow.

Our plan was to provide a graphical representation of the relevant hashtags and mentions collected during the Oscars. To do so we need to consolidate the raw data in order to create the graphs, and we need to analyze the raw data to find the relevant entities to display. The first task is done easily with Dataflow so let’s start with it; we will then build on this code to get the analysis step.

(Google Cloud Dataflow is still in it’s alpha stage but you can apply to have access on this page. Once you have an account, the setup process and the tutorials are pretty helpful to learn how to use it).

Here’s what we did with Dataflow.