I use Google’s BigQuery as my go-to data warehouse; it is insanely robust, very fast, easy to use, and super cost-effective. Most of the time I find myself querying BigQuery and loading the result into a dataframe using BigQuery’s python client library. Wouldn’t it be awesome if I could just type a SQL query into an IPython cell, and have the cell execute that SQL on BigQuery and store the result in a DataFrame? I’d want to be able to do something like this:

Turns out embedding SQL for BigQuery directly into notebook cells is possible in IPython via the use of cell magics. Cell magics are those commands that look like %%this which reads the contents of an entire cell and calls a registered magic function that two positional arguments, line and cell. The line argument contains the string of the line used to invoke the cell magic, e.g. the %%bq line. The cell argument is also a string that contains the rest of the contents of the cell, not including the line.

If we take the example above, the magic function bq would be called with "--name hn_daily_comment_counts" as the line argument, and "SELECT

...

day ASC;" — or the contents of the rest of the cell after the initial line — as the cell argument.

We can use the BigQuery python client along with the cell magic API to have the magic function query BigQuery and return the result.

As you can see above, I contain all of my magics (in this case just one) in a ResearchEnvMagics class. It’s important that this class inherits from IPython’s Magics class, as well as has the @magics_class decorator attached to it.

I also make use of IPython’s magic_arguments package in order to allow passing additional configuration parameters to my cell magic, similar to what you would use on the command line. This is a common pattern for IPython magics, hence they provide a built-in package for it, which is awesome!

The final step here is to register these magics with IPython via the extension load hook:

And that’s all there is to it! Now that the magic function is registered, it can be used in Jupyter notebooks to easily perform research and gain insights.

Of course, after building this BigQuery magic I found out by digging through the client library API docs one night that it already ships with IPython magics 😑. I wish that had been at all documented somewhere, but I enjoyed building it out nonetheless and learned a lot about cell magics in IPython. As a next step, I’d like to figure out how to add SQL syntax highlighting for the BigQuery cell magic.

I hope you found this useful, and if you use special custom IPython magics in your research environments, let me know in the comments! Would love to share information and tips. Check out below to see the extension in action on some example queries from BigQuery’s HackerNews dataset.