BigQuery and the Gdelt Project. Beyond dreams of Marketing Analysts.

albertovpd · Feb 7

Can you imagine a dataset, updated every 15 minutes, containing information from public media across the whole planet?

Can you imagine running a Sentiment Analysis on a company or product in any location, or worldwide, regardless of the language?

Well, the first question is answered by the breathtaking Gdelt Project, a titanic effort that covers far more than any single project can use and stores information in 65 languages.

The second question is answered by Google. The whole Gdelt dataset is available in BigQuery, a powerful environment covering every stage of working with data: collecting data, creating datasets, filtering them with advanced SQL queries (further features, such as looping and conditionals within a query, are expected), applying Machine Learning algorithms, and visualizing the results by exporting them to Google Data Studio.
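As a rough illustration of the kind of filtering query meant here, the sketch below builds a standard SQL query string in Python. It assumes the public table `gdelt-bq.gdeltv2.gkg_partitioned` with its `DATE`, `DocumentIdentifier`, `V2Organizations` and `V2Tone` columns and the `_PARTITIONTIME` pseudo-column; this post does not say which table or filters were actually used, and `build_gkg_query` and the company name are hypothetical.

```python
from datetime import date


def build_gkg_query(company: str, start: date, end: date) -> str:
    """Build a standard SQL query for GKG rows that mention `company`.

    Assumption: the table and column names below refer to the public
    GDELT GKG table in BigQuery; the original study may have used others.
    The date range is half-open: `start` is included, `end` is excluded.
    """
    if start > end:
        raise ValueError("start must not be after end")
    # Escape backslashes and single quotes so the name is a safe SQL literal.
    safe_name = company.lower().replace("\\", "\\\\").replace("'", "\\'")
    return f"""
SELECT
  DATE,
  DocumentIdentifier,
  V2Tone
FROM `gdelt-bq.gdeltv2.gkg_partitioned`
WHERE _PARTITIONTIME >= TIMESTAMP('{start.isoformat()}')
  AND _PARTITIONTIME < TIMESTAMP('{end.isoformat()}')
  AND STRPOS(LOWER(V2Organizations), '{safe_name}') > 0
""".strip()


# Hypothetical company name, used only to show the generated SQL.
print(build_gkg_query("example academy", date(2019, 6, 1), date(2020, 1, 1)))
```

Restricting on the partition column keeps the scan limited to the chosen dates, which matters on a dataset of this size. For anything beyond a sketch, BigQuery query parameters are a safer choice than string formatting.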

For more advanced requirements, you can connect BigQuery to Google's notebook IDE (Google Colab), keep working in Python, and export the results back to Data Studio for a more visual solution.
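A minimal sketch of that Colab workflow, assuming the `google-cloud-bigquery` client library and Colab's `google.colab.auth` helper are available in the notebook. The function names, the project ID and the table ID are hypothetical. Writing the result back to a BigQuery table is one way to make it available to a Data Studio report, not necessarily the route taken in this study.

```python
def make_colab_client(project_id: str):
    """Authenticate the Colab user and return a BigQuery client.

    The imports are local because they only succeed inside Colab
    (google.colab) or where google-cloud-bigquery is installed.
    """
    from google.colab import auth
    from google.cloud import bigquery

    auth.authenticate_user()  # triggers the Google sign-in flow in the notebook
    return bigquery.Client(project=project_id)


def query_to_dataframe(client, sql: str):
    """Run `sql` and return the result as a pandas DataFrame."""
    return client.query(sql).to_dataframe()


def save_dataframe(client, df, table_id: str) -> None:
    """Write `df` to a BigQuery table that a Data Studio report can read."""
    client.load_table_from_dataframe(df, table_id).result()  # wait for the load job


# Usage inside a Colab cell (hypothetical project and table IDs):
# client = make_colab_client("my-gcp-project")
# df = query_to_dataframe(client, "SELECT 1 AS x")
# save_dataframe(client, df, "my-gcp-project.my_dataset.sentiment_results")
```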

Using the tools mentioned above, we can carry out a full Sentiment Analysis on networks.

The study shown below is for a company dedicated to education in digital business (Data Analytics, Web Development, UX/UI and more) with several campuses worldwide.

Taking a glance at the features, we can confirm the following:

The general perception of this company is really good, as shown by tone.

arf shows that the text found is slightly non-neutral (maybe due to personal publications, or writing with emphasis).

Polarity is moderate, suggesting that the texts found were not highly emotionally charged.

sg_rf is moderate, which suggests there is no strong belonging-to-group feeling.

Within the studied interval, pos_score remains above neg_score, and both are coherent with tone. For some reason, September and November 2019 were the roughest months for this company. This could be due to a lack of presence on the internet; nevertheless, neg_score stays at its level, which implies that the good perception people have simply decreased in those months.
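The feature names above line up with the comma-separated `V2Tone` field of the Gdelt Global Knowledge Graph, which lists tone, positive score, negative score, polarity, activity reference density, self/group reference density and word count, in that order. Assuming that is where they came from (the post does not say so explicitly), the sketch below splits one such value into named features and averages them per month. The short names `arf` and `sg_rf` follow the post, the helper names are hypothetical, and the sample rows are made up purely to exercise the code.

```python
from collections import defaultdict
from statistics import mean

# Field order assumed from the GKG V2Tone layout; short names follow the post.
V2TONE_FIELDS = (
    "tone", "pos_score", "neg_score", "polarity", "arf", "sg_rf", "word_count",
)


def parse_v2tone(raw: str) -> dict:
    """Split one V2Tone string into a {feature_name: float} dict."""
    parts = raw.split(",")
    if len(parts) < len(V2TONE_FIELDS):
        raise ValueError(f"expected {len(V2TONE_FIELDS)} values, got {len(parts)}")
    return {name: float(value) for name, value in zip(V2TONE_FIELDS, parts)}


def monthly_means(rows) -> dict:
    """Average every feature per month.

    `rows` is an iterable of (timestamp, v2tone) pairs, where the timestamp
    is in YYYYMMDDHHMMSS form (int or str), as assumed for the GKG DATE column.
    Returns {"YYYY-MM": {feature_name: mean_value}} ordered by month.
    """
    buckets = defaultdict(list)
    for stamp, raw in rows:
        text = str(stamp)
        buckets[f"{text[:4]}-{text[4:6]}"].append(parse_v2tone(raw))
    return {
        month: {name: mean(item[name] for item in items) for name in V2TONE_FIELDS}
        for month, items in sorted(buckets.items())
    }


# Made-up sample rows, not data from the study.
sample = [
    (20190903120000, "1.5,3.0,1.5,4.5,20.0,0.5,200"),
    (20190921080000, "-0.5,2.0,2.5,4.5,22.0,1.5,400"),
    (20191005093000, "2.0,4.0,2.0,6.0,18.0,1.0,100"),
]
for month, feats in monthly_means(sample).items():
    print(month, feats["tone"], feats["pos_score"], feats["neg_score"])
```

Comparing the monthly `pos_score` and `neg_score` averages side by side is what makes it possible to tell a drop in positive coverage apart from a rise in negative coverage.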

So, this was done as a brief and simple sketch. The possibilities of BigQuery itself, or of BigQuery with the Gdelt Project, will stack-overflow the expectations of any analyst by far. 35 GB of plain text was processed in this study.

The interactive graph can be found here.

Hope you liked it,

Alberto.

Member of the Data Team at Labelium Spain.