This is the best image I could come up with

Some time ago I tweeted about my tips for using BigQuery in the enterprise. For some reason people liked it and it got quite a lot of attention. So, in the interest of posterity, I wanted to make it easier to find these tips and hence the reason for this article.

I’ll continue to do this in future for other stuff too. Also, if you’ve got some of your own tips to add to this list, then please feel free to comment, and I’ll add them — but only if they’re as bad as mine.

✅ Export logs from all projects and services into one central BigQuery project/dataset for easy auditing and analysis. This has come in so handy for us over the years.

More info here.

✅ Enable billing exports to BigQuery and use Data Studio to visualise/track spend, and drill down on BigQuery usage/metrics. You can use this to find those pesky analysts who are running SELECT * statements!

More info here.

✅ Look into using BigQuery custom cost controls to help manage and predict costs if you’re using the on-demand pricing model.

More info here.

✅ Fine tune ACLs to user/groups roles & responsibilities and employ pattern of least privileges. Writing to production should only be done by service accounts. Use views to control how tables in production are read from.

More info here.

✅ Look into flat-rate pricing (>$40K spend) if your workloads are high-volume, and you want stable and predictable monthly bills. While this is probably only justifiable for very big enterprises, it’s still worth knowing about nevertheless.

More info here.

✅ If using Slack (or other decent IM tool), use its RSS feed integration app to hook into BigQuery’s release notes feed. This keeps the team frosty with the latest releases without having to keep visiting the website. No more missed announcements!

More info here.

✅ Federated queries to GCS and Sheets are an extremely powerful and useful feature of BigQuery. We use this feature mostly for product managers and the like, so that they can easily keep maintain and update mapping tables for their datasets — not everyone knows SQL unfortunately! Of course, it should go without saying that we keep them in source control and they are included as part of the build pipeline.

More info here.

✅ Speaking of source control, make sure you keep all your SQL in it too. That goes for views too. When creating or updating a view, it should be part of your build pipeline.

More info here.

More tips from people who read this article:

✅ Investigate integrating Google’s Data Loss Prevention (DLP) API with BigQuery to secure sensitive data. The DLP API has native support for BigQuery, Cloud Datastore and Cloud Storage.

More info here.

https://www.servian.com/gcp/