Friday, November 1, 2019

Note: There will be no obvious Wu-Tang Clan song references in this post. Except for this one.

In my previous post (https://markphelps.me/2019/09/migrating-from-travis-to-github-actions/), I described my journey migrating my open source Go project, Flipt, from Travis to GitHub Actions.

While this has turned out great, one thing I did find missing in Actions was the ability to use a dependency cache to speed up builds. Travis has had this ability for a while now, and I took advantage of it back when I ran my Flipt builds on Travis.

Well, it turns out that the GitHub Actions team has quietly released the ability to save and restore cached dependencies using Actions! The actions/cache repository on GitHub explains how you can set up caching as part of your workflows.

Note: As of this writing, actions/cache is in a preview release state, which means syntax and functionality could change before v1.

I just discovered the cache action yesterday and was eager to try and set it up in my Flipt workflows.

Pre-Cache

Since Flipt uses Go modules and does not vendor its dependencies, each build run on GitHub Actions previously had to download all required modules from the Go module mirror or from GitHub itself.

This was pretty wasteful, as the modules used don't change all that much but were still being downloaded on every run. It also meant that the build was dependent on how quickly these modules could be downloaded, and on their availability.

Build Cache to the Rescue

At the time of this post, there were no clear instructions in the actions/cache/examples README for Go, so it took a little trial and error to get Flipt and Go Modules working with the cache action.

Here's the end result:

The required parts are:

- Setting the path correctly
- Choosing the right cache key

Setting the Path Correctly

In order for the cache to work, you need to tell it which files you want cached. In our case we want to cache all of the Go modules that are used in our build. This brings up the important question:

Where are Go module source files stored?

I found the answer after some googling.

Spoiler: modules are stored at $GOPATH/pkg/mod.

It seems the $GOPATH is still useful for something!

With this new knowledge, I first tried setting path to:

path: $GOPATH/pkg/mod

Which didn't work. It seems that Actions wasn't able to resolve $GOPATH here. Perhaps this was a misconfiguration or an oversight on my part, but I looked for a different solution regardless.

I needed to find a solution that didn't depend on environment variables to set the path correctly. I came upon this (long) issue in the actions/setup-go repo, which is used to, well, set up Go in Actions.

This led me to this comment, which states that:

Recent Go versions already default this to $HOME/go

So, with that in mind I updated my action to set the path to:

path: ~/go/pkg/mod

And all was well!
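If you would rather not hard-code the home directory, one alternative is to resolve GOPATH in a shell step and hand it to the cache step through an expression. As far as I can tell, inputs under with: are expanded as ${{ }} expressions but are never passed through a shell, which would explain why a bare $GOPATH was not resolved. This is only a sketch and not what Flipt uses: the step name and the actions/cache version tag are assumptions.

```yaml
steps:
  # Assumes the repository was already checked out and Go is on the PATH.

  # Ask the Go toolchain where GOPATH is and export it to later steps.
  # GITHUB_ENV is the file the runner reads to set environment variables.
  - name: Export GOPATH
    shell: bash
    run: echo "GOPATH=$(go env GOPATH)" >> "$GITHUB_ENV"

  # The version tag is an assumption; pin whichever release you actually use.
  - uses: actions/cache@v4
    with:
      # ${{ env.GOPATH }} is expanded by Actions before the step runs.
      path: ${{ env.GOPATH }}/pkg/mod
      key: ${{ runner.os }}-${{ hashFiles('**/go.sum') }}
```

The hard-coded ~/go/pkg/mod is simpler and is what I stuck with; the version above only earns its extra step if GOPATH is customized on your runners.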

Choosing the Right Cache Key

The last part of setting up a successful caching strategy is to select the right cache key. The cache key is used to determine when we can rely on the files cached and when we cannot.

Simply put:

You want to use the cache whenever possible, but ‘bust’ the cache when dependencies change.

In Go's case, we can depend on the go.sum file to tell us when any of our dependencies change, since it lists all modules with their versions, along with hashes of their contents.

Thankfully, GitHub Actions provides a nice helper function, hashFiles(), which returns a SHA-256 hash of the files matching the pattern you give it. This is exactly what we need for a cache key, since any change to the go.sum file results in a new hash value.

I updated key to:

key: ${{ runner.os }}-${{ hashFiles( '**/go.sum' ) }}

This will generate cache keys that contain:

- The OS that the action is running on (Linux, Windows, macOS, etc.)
- The SHA-256 hash of the go.sum file

Note: It is important that **/ prefixes go.sum in the argument to hashFiles() in order for Actions to find the file.
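Putting the path and the key together, a complete workflow could look roughly like the sketch below. It is a minimal illustration rather than Flipt's actual workflow: the file name, job name, Go version, and action version tags are all assumptions, so pin whichever releases you actually use.

```yaml
# .github/workflows/test.yml (hypothetical file name)
name: test
on: [push]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      # go.sum must be present on disk before hashFiles() can see it.
      - uses: actions/checkout@v4

      - uses: actions/setup-go@v5
        with:
          go-version: '1.21.x'  # assumption: use the version your project targets
          # Newer setup-go releases can cache modules on their own; switch that
          # off here so only the explicit cache step below is in play.
          cache: false

      # Restores ~/go/pkg/mod when the key matches, and saves it after the job
      # finishes when no cache entry existed for this key.
      - uses: actions/cache@v4
        with:
          path: ~/go/pkg/mod
          key: ${{ runner.os }}-${{ hashFiles('**/go.sum') }}

      - name: Test
        run: go test ./...
```

The cache step sits before go test so that the modules are already in place by the time the toolchain looks for them.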

Results

After all was configured properly, I pushed a new commit and was able to see that actions/cache was working as expected!

A couple of things to note:

- go test no longer had to download all the modules before running, which was the point of all this.
- The build was slightly faster than the no-cache build. However, this small speedup could be attributed to many things beyond our control, such as build machine usage at the time of our runs.
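One way to make the first point visible in the build log is to gate an explicit download step on the cache step's cache-hit output, so the download only shows up on a miss. This is a sketch under assumptions: the step id gomod, the step names, and the version tags are hypothetical choices of mine.

```yaml
steps:
  - uses: actions/checkout@v4

  # Give the cache step an id so later steps can read its outputs.
  - id: gomod
    uses: actions/cache@v4
    with:
      path: ~/go/pkg/mod
      key: ${{ runner.os }}-${{ hashFiles('**/go.sum') }}

  # cache-hit is 'true' only when the key matched exactly,
  # so this step is skipped whenever the modules were restored.
  - name: Download modules (cache miss only)
    if: steps.gomod.outputs.cache-hit != 'true'
    run: go mod download

  - name: Test
    run: go test ./...
```

Comparing the duration of the download step across runs gives a more direct measure of what the cache saves than the total build time does.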

Wrap Up

I expect that build durations will show a clear decrease over time with the use of this new cache functionality.

To me, the addition of caching to GitHub Actions means that it is fully capable of replacing most of the products from existing CI providers and I look forward to what the Actions team has in store for the future!
