Using my new superpowers to automatically import stock prices into BigQuery, I went looking for the Wikipedia pages whose daily views were most correlated with GOOG’s stock price during July 2016. Would you be surprised if I told you the winner was the list of Pokémon: Indigo League episodes?

SELECT CORR(a.req, b.close) corr, title, COUNT(*) c, SUM(req) requests
FROM (
  SELECT SUM(requests) req, title, COUNT(*) c, DAY(datehour) day, COUNT(*) OVER(PARTITION BY title) days
  FROM [fh-bigquery:wikipedia.pagecounts_201607]
  WHERE requests > 5
  AND language = 'en'
  GROUP BY title, day
  HAVING c=24
) a
JOIN (
  SELECT close, DAY(date) day
  FROM [fh-bigquery:public_dump.goog]
  WHERE MONTH(date)=7
) b
ON a.day=b.day
WHERE days>22
GROUP BY title
HAVING c>18
ORDER BY corr DESC

Query complete (20.6s elapsed, 305 GB processed)

The Wikipedia pages most correlated with the GOOG stock price, July 2016

To get the numbers to draw the time series for the chart:

SELECT a.day day, req List_of_Pokemon_episodes, close goog_close
FROM (
  SELECT SUM(requests) req, title, DAY(datehour) day
  FROM [fh-bigquery:wikipedia.pagecounts_201607]
  WHERE title='List_of_Pok%C3%A9mon:_Indigo_League_episodes'
  AND language = 'en'
  GROUP BY title, day
) a
LEFT JOIN (
  SELECT close, DAY(date) day
  FROM [fh-bigquery:public_dump.goog]
  WHERE MONTH(date)=7
) b
ON a.day=b.day
ORDER BY day

Warning 1

The correlation is funny, but spurious. With only 20 data points in each series, it is easy to find highly correlated series among hundreds of thousands of daily Wikipedia pageview time series. But it’s fun :).
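To get a feel for why Warning 1 matters, here is a minimal simulation sketch in Python. It is not part of the original analysis: the function names, the choice of 5,000 candidate series, and the use of pure Gaussian noise are all my assumptions for illustration. It compares one random 20-point "target" series against many unrelated random series and reports the best correlation found.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_spurious_correlation(n_series=5000, n_points=20, seed=0):
    """Highest correlation between a noise target and n_series noise candidates.

    Every series is independent Gaussian noise (an assumption for this
    sketch), so any high correlation found here is spurious by construction.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = -1.0
    for _ in range(n_series):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, pearson(target, candidate))
    return best


if __name__ == "__main__":
    best = best_spurious_correlation()
    print(f"best correlation among pure-noise series: {best:.2f}")
```

Raising `n_series` toward the hundreds of thousands of pages in the real table pushes the best spurious match higher still, while raising `n_points` pulls it back down. Short series plus many candidates is exactly the setting of the query above.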

Warning 2

The first query above scans 305 GB of Wikipedia pageviews. That fits within BigQuery's free monthly terabyte of queries, but only about three times over. If I wanted to play more with this, I would first extract a summary of the pageviews I’m interested in into a much smaller table.

More?

Want more stories? Check my Medium, follow me on Twitter, and subscribe to reddit.com/r/bigquery. And try BigQuery: every month you get a full terabyte of analysis for free.

Disclaimer: I’m a Google employee and nothing in this post constitutes a recommendation on whether to buy, sell or hold shares of any particular stock.