GitHub statuses made easy with FaaS

Using Azure Functions to automate your pull request validation

If you have an open source project, the more of the pull request review process you can automate the better. At least that’s what I believe.

If you go through your checklist of expectations, some things are obvious, like continuous integration, for which good services already exist. But some of your checks could be unique and currently manual. One check I recently automated is the example in this post.

The Example

The example in this blog post validates that each commit in a pull request comes from a known GitHub user. It’s quite an easy check, but also easy to miss when reviewing by hand, as the pull request itself will always originate from a known GitHub user. So just by looking at the pull request on GitHub, it’s not an obvious issue.

Reasons to do this could be, for example, that you want to automate attribution of authors in release notes, validate CLAs and so on.

The reason this occurs often isn’t malicious intent, but rather that an email address unknown to GitHub was used for the commit, e.g. the contributor’s corporate email address.

The easiest fix is for the contributor to add all their email addresses under GitHub Settings/Emails. The other option is to amend the commit with the desired email address and force push the new commit.

The Concept

FaaS, Functions as a Service, is very well suited for this kind of scenario, as you don’t need a web site/server up and running 24/7. Instead, FaaS lets you have code that’s only up and running when invoked, the idea being that you pay only for what you use, when you use it.

Roughly, the flow is this: GitHub triggers a webhook call to a function; the function parses what’s needed out of the GitHub payload and places it on an Azure Storage Queue; the queue triggers a second function that does the validation and reports status back to the pull request using the GitHub REST API.

New Function

I usually use the Azure Functions CLI scaffolding, write my functions in VS Code and then deploy via Kudu or the CI/CD of choice. But the Azure portal is quite capable too and can be a quick way to get started, and it’s fully possible to get the code into source control later (source control should always be the way in production, but when getting started, prototyping and playing around, the portal is actually quite productive).

Azure Functions has a ready-made template to quickly get started with GitHub webhooks.

This template will give you a good starting point to dynamically parse the data posted from GitHub, making it easy to support most scenarios.

Adding a queue

So now we have data set up to come in to the service and the ability to parse it. Now we just want to quickly cherry-pick what’s needed out of the GitHub payload, push it to a queue for later processing and tell GitHub we got it.

Go to “Integrate” to add a queue output via the portal and click “+ New Output”. In this case we want an Azure Storage Queue.

You then give the queue output a parameter name, which will be the name of the output parameter in your function, then the name of the queue to push messages to, and last choose the app setting where your storage queue connection string resides.

Hit save and our queue parameter binding is ready.

Filling the queue

Now we just need to add it as an output parameter to our function so we can assign content to it.

Using an output parameter means you lose the async/await capabilities, since C# async methods can’t have out parameters. If you don’t need multiple outputs and just want to return “200 OK” when the payload is processed and queued successfully, you can keep using async/await by going to “Integrate”, removing the HTTP result binding and setting the queue to bind to the return value of the function.

So a complete function that parses and queues while still supporting async/await could look something like this:
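A minimal sketch of the parsing half of such a function, written as plain C# so it also runs outside Azure Functions. The names WebhookParser and ToQueueMessageAsync and the output property names are assumptions made for illustration; in run.csx the body would come from the HTTP request and the returned string would be bound to the queue. The fields it reads (action, pull_request.statuses_url and pull_request.commits_url) are part of GitHub’s pull request webhook payload.

```csharp
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;

public static class WebhookParser
{
    // Reads a GitHub pull_request webhook body and returns the JSON
    // message to put on the queue. Throws if the expected properties
    // are missing, e.g. when the payload is not a pull request event.
    public static async Task<string> ToQueueMessageAsync(Stream body)
    {
        using JsonDocument doc = await JsonDocument.ParseAsync(body);
        JsonElement root = doc.RootElement;
        JsonElement pullRequest = root.GetProperty("pull_request");

        // Cherry-pick only what the validation step needs later
        return JsonSerializer.Serialize(new
        {
            Action = root.GetProperty("action").GetString(),
            StatusesUrl = pullRequest.GetProperty("statuses_url").GetString(),
            CommitsUrl = pullRequest.GetProperty("commits_url").GetString()
        });
    }
}
```

Returning the message instead of assigning it to an out parameter is what keeps the method compatible with async/await.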

Typed parameter

A neat thing is that the queue item can be changed to a custom class, and then it will automatically be serialized to and deserialized from the JSON string. So let’s create a class, and to keep it nice and tidy we’ll add it to its own file.

Hit “View Files” -> “+ Add”, then type in the filename and hit enter.

The class for our payload would look something like the one below.
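A sketch of what such a payload class might contain, assuming the three values described in this post are all that needs to travel through the queue. The property names are hypothetical and only need to match on both sides of the queue.

```csharp
// Hypothetical shape of the queue message: the three values picked
// out of the webhook. Property names are assumptions for this sketch.
public class GitHubPayload
{
    // Pull request event action, e.g. "opened" or "synchronize"
    public string Action { get; set; } = "";

    // URL to post commit statuses to
    public string StatusesUrl { get; set; } = "";

    // URL to fetch the pull request's commits from
    public string CommitsUrl { get; set; } = "";
}
```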

In our function’s run.csx we can then use the #load directive, like #load "GitHubPayload.csx", to make the class available to our function. We can then change the function’s method signature to a typed return value and end up with something like below:

So what does the above do? Not much and a lot: it parses out the GitHub pull request event’s action, the URL to post status to and the URL to fetch the pull request commits from.

As we bound the return value to the queue it’ll be automatically serialized as JSON and queued.

Pop the queue

Now we need something to pop the queue and do the actual work.

To get up and running quickly, Azure Functions actually offers a scaffolding option.

If you go to “Integrate” and select the queue output we previously created, there’s a “Create a new function triggered by this output” action.

This will pre-select a C# Queue Trigger, fill in the queue name and the storage account connection used.

All you need to do is hit create, but you probably want to give your function a more meaningful name than QueueTriggerCSharp1 too.

The function created will look something like below.

It gets triggered when something is enqueued, the dequeued value is passed in as a parameter, and the template example just writes it to the trace log.

Parse & Validate

The queue trigger can also be typed by reusing the class we previously created. This way Azure Functions will automatically deserialize the queued JSON into a typed object we can easily work with.

The above code validates that it’s an event/action we’re interested in and makes sure we’ve got the status API URL needed to report back status and the commits API URL needed to fetch the pull request commit details.
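That check can be sketched roughly as follows. The list of actions worth handling is an assumption (GitHub sends opened, reopened and synchronize, among others, for pull requests), and PayloadValidator and ShouldProcess are hypothetical names; the payload class is included so the sketch compiles on its own.

```csharp
using System;

// Hypothetical payload class, as it might live in GitHubPayload.csx
public class GitHubPayload
{
    public string Action { get; set; } = "";
    public string StatusesUrl { get; set; } = "";
    public string CommitsUrl { get; set; } = "";
}

public static class PayloadValidator
{
    // Assumption: only these pull request actions can change the commits
    private static readonly string[] InterestingActions =
        { "opened", "reopened", "synchronize" };

    // True when the action is one we care about and both API URLs
    // needed later (post status, fetch commits) are absolute URLs
    public static bool ShouldProcess(GitHubPayload payload) =>
        payload != null
        && Array.IndexOf(InterestingActions, payload.Action) >= 0
        && Uri.IsWellFormedUriString(payload.StatusesUrl, UriKind.Absolute)
        && Uri.IsWellFormedUriString(payload.CommitsUrl, UriKind.Absolute);
}
```

Returning early when ShouldProcess is false lets the function complete normally for messages it has nothing to do with, instead of failing and retrying them.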

GitHub authorization

Now that we’ve got the queue item parsed and ready, the next step is to fetch the commits from GitHub. Before we can do that, we need to be able to authenticate against GitHub.

GitHub has several ways of authenticating against its APIs. One way is through personal access tokens, which is what I’ll be using for this blog post.

You’ll find your personal access tokens by going to “Settings” and then click on “Personal access tokens” in the left menu.

Then click the “Generate Token” button, give the token a description and choose the required access. There are loads of available permissions, but for this post “repo:status” is the only access needed.

Hit “Generate token” and you’ll be presented with the newly generated token.

App settings

We don’t want to store our token in our source code. Fortunately Azure Functions, just like the other Azure app services, provides a means to store app settings.

You find it under “Function app settings” -> “Configure app settings”

There you have lists of key-value pairs where you can add, delete and modify both app settings and connection strings.

All app settings & connection strings can be accessed as environment variables, which makes them easy to access regardless of which of the languages supported by Azure Functions you choose to use.

Connection string environment variables are prefixed with their provider type, e.g. if you choose the custom provider it’ll be prefixed with CUSTOMCONNSTR_, so if you enter a connection string named GITHUB_TOKEN you’ll access it as CUSTOMCONNSTR_GITHUB_TOKEN.
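Reading the token back is then a plain environment variable lookup. A minimal sketch, assuming the connection string was saved as GITHUB_TOKEN with the custom provider; the Settings class and GetGitHubToken method are hypothetical names.

```csharp
using System;

public static class Settings
{
    // Reads the token stored as a custom connection string named
    // GITHUB_TOKEN; fails loudly if it has not been configured
    public static string GetGitHubToken() =>
        Environment.GetEnvironmentVariable("CUSTOMCONNSTR_GITHUB_TOKEN")
        ?? throw new InvalidOperationException(
            "Connection string GITHUB_TOKEN is not configured.");
}
```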

Calling GitHub

So now that we’ve got our token and app setting in place, we want to get and iterate over the pull request commits, validate each and then post whether it passed or failed validation. The .NET HttpClient is used to call the GitHub API. For convenience and to keep it nice and tidy I’ve wrapped it in two helper methods, GetObjectAsync and PostObjectAsJsonAsync, both utilizing the same method for common headers and authentication, all placed in their own HttpClientHelper.csx file.
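A rough sketch of what such a helper could look like. Only the two method names come from this post; the signatures, the CreateRequest method, the User-Agent value and the use of System.Text.Json are assumptions. GitHub does require a User-Agent header on API requests and accepts a personal access token in the Authorization header.

```csharp
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class HttpClientHelper
{
    private static readonly HttpClient Client = new HttpClient();

    // Shared by both helpers: common headers plus token authentication
    public static HttpRequestMessage CreateRequest(HttpMethod method, string url, string token)
    {
        var request = new HttpRequestMessage(method, url);
        // Hypothetical product name; GitHub rejects requests without a User-Agent
        request.Headers.UserAgent.Add(new ProductInfoHeaderValue("author-validator", "1.0"));
        request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/vnd.github+json"));
        request.Headers.Authorization = new AuthenticationHeaderValue("token", token);
        return request;
    }

    // GETs the URL and deserializes the JSON response body into T
    public static async Task<T> GetObjectAsync<T>(string url, string token)
    {
        using var request = CreateRequest(HttpMethod.Get, url, token);
        using var response = await Client.SendAsync(request);
        response.EnsureSuccessStatusCode();
        string json = await response.Content.ReadAsStringAsync();
        return JsonSerializer.Deserialize<T>(json);
    }

    // POSTs the value as a JSON body and throws on a non-success status code
    public static async Task PostObjectAsJsonAsync<T>(string url, string token, T value)
    {
        using var request = CreateRequest(HttpMethod.Post, url, token);
        request.Content = new StringContent(
            JsonSerializer.Serialize(value), Encoding.UTF8, "application/json");
        using var response = await Client.SendAsync(request);
        response.EnsureSuccessStatusCode();
    }
}
```

Keeping a single static HttpClient and building a fresh request per call avoids sharing mutable default headers between invocations.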

Finishing the puzzle

Putting all the pieces together we end up with a function like this
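A minimal sketch of the core validation step, kept free of HTTP and Azure bindings so it can run on its own. It relies on the GitHub REST API returning a null top-level author for commits whose email isn’t linked to a GitHub account, and it builds the body for the commit status call (state, description, context, target_url). AuthorValidation, BuildStatus, the context name and the description texts are assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class AuthorValidation
{
    // Builds the body for GitHub's "create a commit status" call from
    // the JSON array returned by the pull request commits endpoint
    public static Dictionary<string, string> BuildStatus(string commitsJson)
    {
        using JsonDocument doc = JsonDocument.Parse(commitsJson);
        int total = 0;
        int unknown = 0;
        string firstUnknownUrl = "";

        foreach (JsonElement commit in doc.RootElement.EnumerateArray())
        {
            total++;

            // The top-level "author" is null when the commit email is
            // not linked to any GitHub account
            bool known = commit.TryGetProperty("author", out JsonElement author)
                && author.ValueKind == JsonValueKind.Object;
            if (known) continue;

            unknown++;
            if (firstUnknownUrl.Length == 0
                && commit.TryGetProperty("html_url", out JsonElement url))
            {
                firstUnknownUrl = url.GetString() ?? "";
            }
        }

        var status = new Dictionary<string, string>
        {
            ["context"] = "author-validation", // hypothetical check name
            ["state"] = unknown == 0 ? "success" : "failure",
            ["description"] = unknown == 0
                ? $"{total} author validation(s) passed"
                : $"{unknown} of {total} commit(s) have an unknown author"
        };

        // Point the status "Details" link at the first offending commit
        if (firstUnknownUrl.Length > 0) status["target_url"] = firstUnknownUrl;
        return status;
    }
}
```

In the queue-triggered function, the commits JSON would be fetched from the payload’s commits URL and the resulting dictionary posted to its statuses URL.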

And now we’re feature complete: ready to queue GitHub webhook requests, validate the input, and validate and report whether each commit has a registered GitHub user as author.

Configuring webhook

Now we just need to configure GitHub to call our function when a pull request is created or modified. This is done under the repository (or organization) settings, “Webhooks” -> “Add webhook”.

Under your GitHub webhook function you can find the URL and secret to be used. Enter these as Payload URL and Secret, and select application/json as Content type.

Choose to select individual events and select only the pull request event.

Then just hit “Add webhook” and you’re done.

Testing first PR

Now we create our first PR to the repository we’ve configured our webhook on.

If all goes to plan, you should see that 1 author validation has been performed and as I’m a known GitHub user the test passes and all’s OK.

Now we’ll add a commit from an unknown user. The easiest way to test this is by overriding the author when making the commit, which from the command line is done with git commit’s --author option.

Then we push our changes, which triggers the GitHub pull request synchronize event and, with it, our functions.

Pressing the details link will take us to the commit displaying the unknown author.

Conclusion

It probably took a lot longer to read this post than it would take to get the functions up and running. As with many things, it’s easy when you know how. Azure Functions really lets you focus on what you want to solve, and as I’ve shown, combining it with third-party APIs is super simple, which makes it really powerful. Working with queues is a breeze, and for incoming webhooks that’s a great approach, as you just receive and acknowledge, doing the actual work in a separate function. You can also hook up functions to the poison message queue for each queue, making it a very robust solution without making it too complex.

On a final note, extending your GitHub pull request review process with small automatic checks is a great way to not only save time, but also speed up communication with your contributors, and the sky truly is the limit on what you can check! You can also do bot-like things, such as having certain maintainer comments trigger an integration test, create new issues, send a Microsoft Teams message, etc.

So I think for these scenarios the functions as a service approach fits like a glove! What do you think? Would love to hear your feedback!

The Code

The complete code for the functions in this blog post can be found on GitHub

Previous posts

If you want to learn more about Azure Functions, please check out some of the previous posts I’ve done on the topic!