Accessing BigQuery from App Engine

Authenticating API calls with OAuth 2.0 Service Accounts

A Google Cloud Platform customer asked me today how to list all the BigQuery datasets available from a Google App Engine app.

If you don’t know what BigQuery or App Engine are this post is probably not for you … yet! Instead you should have a look at the docs for BigQuery, and App Engine.

BigQuery and App Engine are two of my favorite products of Google Cloud Platform.

The solution for this is quite simple, but I think there’s enough moving pieces than a blog post was required. Let’s assume that the list will be displayed as part of some the handling to a request, something like this:

func handle(w http.ResponseWriter, r *http.Request) {

// create a new App Engine context from the request.

c := appengine.NewContext(r) // obtain the list of dataset names.

names, err := datasets(c)

if err != nil {

http.Error(w, err.Error(), http.StatusInternalServerError)

return

} w.Header().Set("Content-Type", "text") // print it to the output.

if len(names) == 0 {

fmt.Fprintf(w, "no datasets visible")

} else {

fmt.Fprintf(w, "datasets:

\t" + strings.Join(names, "

"))

} }

Every time a new HTTP request comes the handler define above will be executed. It will create a new App Engine context that will then be used by the datasets function to return a list with all the names of the BigQuery datasets visible by the App Engine app. Finally, the list will be printed to the output w.

In order to register the handler so it is executed on every HTTP request we add an init func.

func init() { // all requests are handled by handler.

http.HandleFunc(“/”, handle) }

Now for the interesting part, how do we implement the datasets function?

func datasets(c context.Context) ([]string, error) { // some awesome code ... }

Let’s implement the body of that function in three parts:

1. Create an authenticated HTTP client

Given a context (context.Context) named c we can create an authenticated client using this code.

// create a new HTTP client.

client := &http.Client{

Transport: &oauth2.Transport{

Source: google.AppEngineTokenSource(c,

bigquery.BigqueryScope),

Base: &urlfetch.Transport{Context: c},

},

}

In Go, HTTP clients use transports to communicate, and transports use … transports! It’s transports all the way! But what is a HTTP transport exactly?

The Transport field in an HTTP client is of type RoundTripper:



RoundTrip(*error)

} type RoundTripper interface {RoundTrip(* Request ) (* Response

A RoundTripper is responsible for generating a response given a request, but it also has the capacity of changing the request and response being set. This is very useful for monitoring, instrumenting, and also authentication.

The oauth2.Transport type, given a request, adds authentication headers and forwards the request through its base transport.

&oauth2.Transport{

Source: google.AppEngineTokenSource(c, bigquery.BigqueryScope),

Base: &urlfetch.Transport{Context: c},

}

The snippet above creates a new oauth2.Transport that authenticates requests using the default service account for the App Engine project, and then uses an HTTP transport provided by urlfetch — the way of accessing external resources from App Engine apps.

2. Create a BigQuery service and list all datasets

Creating a BigQuery service with “google.golang.org/api/bigquery/v2” is quite simple. Just import the package and create a new service given an HTTP client with bigquery.New.

bq, err := bigquery.New(client)

if err != nil {

return nil, fmt.Errorf("create service: %v", err)

}

It’s important to check that the error is not nil, since the operation could fail as any other operation over the network. Once we have the service, we can list all the datasets visible from this project by following the documentation:

// obtain the current application id, the BigQuery id is the same

appID := appengine.AppID(ctx) datasets, err := bq.Datasets.List(appID).Do()

if err != nil {

return nil, fmt.Errorf("could not list datasets: %v", err)

}

3. Create a list with the project names

We got list, a bigquery.DatasetList, so we can iterate over all the projects in Projects and append their Id to our list before returning it.

var ids []string for _, d:= range datasets.Datasets {

ids = append(ids, d.Id)

} return id, nil

The resulting code is available on GitHub.

I hope this will be helpful to many of you! Feel free to ask questions or send suggestions on Twitter!