November 13, 2019

This is a short write-up on my current set-up for deploying this site. It covers:

building Haskell projects with Github Actions, including caching

generating and deploying a static site via Github Actions

a bit of a review of Github Actions

The manual workflow

The site is generated using the Hakyll static site generator. Using it involves writing a small Haskell program, site, that calls out to the Hakyll libraries to build the website.

This program and the rest of the website sources live in one repository, and are deployed to Github Pages by pushing to a second repository, robx/robx.github.io. The actual source repository is private, but you can check out a snapshot at robx/site-demo.

Updating the site then takes four steps:

edit some website source files

build the site executable

run site to generate the website content

commit and push the updated website to the destination repository

Doing this manually, we would build the executable using cabal build, then build the site using cabal exec site build, and finally do something along the following lines to deploy:

    $ cd /src/robx/robx.github.io
    $ rm -r * && cp -r /src/robx/site/_site/* .
    $ git add -A && git commit -m "update website" && git push
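For comparison with the automated version, here is a rough sketch of those manual steps bundled into one shell function. It is not the tooling I actually use: deploy_site, run, SITE_SRC, SITE_DEST and DRY_RUN are names made up for this example, and the default paths are simply the ones from the commands above.

```shell
# Sketch only: the manual build-and-deploy steps as one function.
# SITE_SRC, SITE_DEST and DRY_RUN are hypothetical knobs for this example.
SITE_SRC=${SITE_SRC:-/src/robx/site}
SITE_DEST=${SITE_DEST:-/src/robx/robx.github.io}

# run CMD...: execute CMD, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

deploy_site() {
  # Build the site executable, then let it generate _site/.
  ( cd "$SITE_SRC" &&
    run cabal build &&
    run cabal exec site build ) || return 1

  # Replace the contents of the pages checkout and push the result.
  ( cd "$SITE_DEST" &&
    run rm -r ./* &&
    run cp -r "$SITE_SRC"/_site/* . &&
    run git add -A &&
    run git commit -m "update website" &&
    run git push ) || return 1
}
```

Setting DRY_RUN=1 before calling deploy_site prints the commands instead of running them, which is a cheap way to check the paths before anything gets deleted.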

The automatic workflow

Since doing this by hand is too much work and a bit error-prone, let’s automate it using Github Actions. To do this, we define a workflow using a YAML file build-deploy.yml that lives in .github/workflows/. You can view the full workflow file, or check the (messy) run history at the demo repository. Below, we’ll go through that file chunk by chunk.

Our workflow gets a name, is set up to execute on push events, and has one job with id build-deploy.

    name: Build and deploy to github pages
    on: push
    jobs:
      build-deploy:

Now comes the body of our build-deploy job. It gets a name as well as a base virtual environment. We also set some variables, since we’ll need to refer to the tool versions twice later on.

    name: Build and deploy
    runs-on: ubuntu-latest
    env:
      GHC_VERSION: '8.6.5'
      CABAL_VERSION: '3.0'

These lines define environment variables that are available in commands that we run from this job, although we don’t use them as such. Additionally, they are stored in the env “context”, and thus available within the workflow below. Note that the env context does not include environment variables beyond those defined within the workflow.

Finally, we give a list of steps that the job should perform. First, we call out to a github-provided action that checks out the working tree corresponding to the event that triggered the workflow.

    steps:
    - uses: actions/checkout@master

There are a number of packaged workflow components, available either from GitHub in the actions organization, or from others via the marketplace. The checkout action is defined in the repository actions/checkout.

Next, we call out to an action to install a Haskell toolchain. We specify versions for GHC and Cabal, referring to the variables using an ad hoc expression language.

    - uses: actions/setup-haskell@v1
      with:
        ghc-version: ${{env.GHC_VERSION}}
        cabal-version: ${{env.CABAL_VERSION}}

Building Haskell projects tends to take too much time: a fresh build of Hakyll on the Github Actions infrastructure takes around 30 minutes. So we’ll cache the compiled dependencies. We’ll be using Cabal’s Nix-style builds here, which store artifacts in $HOME/.cabal/store. The way caching works, we need to provide a cache key that includes full dependency version information. For simplicity, we’ll assume the existence of a Cabal version locking file cabal.project.freeze; see below for another approach.

    - name: 'Run actions/cache@v1: cache cabal store'
      uses: actions/cache@v1
      with:
        path: ~/.cabal/store
        key: cabal-store-${{ runner.OS }}-${{ env.GHC_VERSION }}-${{ hashFiles('cabal.project.freeze') }}
        restore-keys: |
          cabal-store-${{ runner.OS }}-${{ env.GHC_VERSION }}-
          cabal-store-${{ runner.OS }}-

When run, this action will restore any existing archive under the given key, falling back to any of the alternate keys listed under restore-keys. In addition, the action has a “post action”, which will save an archive at the end of a successful run.
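To make the lookup order concrete, here is a simplified model of it in shell. This is only a sketch of the behaviour described above, not the action’s implementation: find_cache and the list of stored keys are invented for the example, and when several stored archives share a prefix this model just takes the first one listed, whereas the real service applies its own rule for picking among them.

```shell
# find_cache KEY STORED RESTORE_KEY...
# STORED is a newline-separated list of archive keys that already exist.
# Prints the key of the archive that would be restored; fails if none fits.
# (Hypothetical helper, for illustration only.)
find_cache() {
  key=$1
  stored=$2
  shift 2

  # 1. An exact hit on the primary key wins.
  if printf '%s\n' "$stored" | grep -qxF -- "$key"; then
    printf '%s\n' "$key"
    return 0
  fi

  # 2. Otherwise try each restore key in order, treating it as a prefix.
  for prefix in "$@"; do
    while IFS= read -r candidate; do
      case $candidate in
        "$prefix"*) printf '%s\n' "$candidate"; return 0 ;;
      esac
    done <<EOF
$stored
EOF
  done

  return 1
}
```

With a stored archive cabal-store-Linux-8.6.5-aaa, a changed freeze file (and hence a new hash) still gets a warm start via the first restore key, and a GHC upgrade falls through to the second.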

Now we’re ready to execute a couple of commands to build the Haskell project and generate the site:

    - run: cabal update
    - run: cabal build --only-dependencies
    - run: cabal build
    - run: cabal exec site build

cabal update fetches the package database from Hackage. (This package database might also be cached between runs, but at ~30s I didn’t bother so far.)

cabal build --only-dependencies builds the dependencies only. It’s useful to split this from the project build itself for two reasons. First, while getting things to work, we can disable later steps in order to get the dependency cache ready, which makes iterating on the later steps a lot faster. Second, we can easily distinguish between problems with the project itself and problems with the packaging infrastructure.

cabal build builds the Hakyll site executable itself.

cabal exec site build calls this executable, generating the website.

Finally, we commit the updated version of the site to the github pages repository, using one of a multitude of third-party actions that deal with this task. This action in particular has the advantage of supporting ssh deploy keys, while most other actions appear to require a (far more powerful) personal access token to interact with a different repository.

    - name: 'Run peaceiris/actions-gh-pages@v2.5.0: deploy to github pages'
      uses: peaceiris/actions-gh-pages@v2.5.0
      env:
        ACTIONS_DEPLOY_KEY: ${{ secrets.ACTIONS_DEPLOY_KEY }}
        PUBLISH_BRANCH: master
        PUBLISH_DIR: _site
        EXTERNAL_REPOSITORY: robx/robx.github.io
      if: github.ref == 'refs/heads/master'

The if: condition ensures that this step is only run on pushes to master. It’s nice to have the rest of the workflow execute on other branches too, so that it can be debugged easily. If we didn’t want that, we might instead have limited the workflow to run only on pushes to master by filtering in the top-level on: field, as follows.

    on:
      push:
        branches:
          - master

To configure the deploy key, generate an ssh key pair, add the public key to the github pages repository as a deploy key, and store the private key in the source repository’s secrets under the key ACTIONS_DEPLOY_KEY. The README of this action has detailed instructions.
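As an aside, the branch check in the if: condition can also be mirrored inside a run: script, since the runner exports the triggering ref to commands as the GITHUB_REF environment variable. A minimal sketch, where on_master is a made-up helper name:

```shell
# on_master: succeeds only for a push to master.
# GITHUB_REF is set by the runner; outside a workflow it is usually unset,
# in which case this check simply fails.
on_master() {
  [ "${GITHUB_REF:-}" = "refs/heads/master" ]
}

if on_master; then
  echo "would deploy"
else
  echo "skipping deploy for ref '${GITHUB_REF:-}'"
fi
```

For a single deploy step the if: field is tidier, but a check like this can be handy when only part of a longer script should be branch-specific.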

Some extra snippets

Debugging

The following step is useful to have in there to aid in debugging a workflow. It stores a number of contexts to env, which is enough to be able to inspect them in the web interface.

    - name: Dump contexts
      env:
        CTX_GITHUB: ${{ toJson(github) }}
        CTX_STEPS: ${{ toJson(steps) }}
        CTX_ENV: ${{ toJson(env) }}
      run: true

Caching with cabal.project.freeze

To get reliable caching with cabal regardless of the existence of a freeze file, you can reorder things as follows:

    - run: cabal update
    - run: '[ -e cabal.project.freeze ] || cabal freeze'
    - name: 'Run actions/cache@v1: cache cabal store'
      uses: actions/cache@v1
      with:
        path: ~/.cabal/store
        key: cabal-store-${{ runner.OS }}-${{ env.GHC_VERSION }}-${{ hashFiles('cabal.project.freeze') }}
        restore-keys: |
          cabal-store-${{ runner.OS }}-${{ env.GHC_VERSION }}-
          cabal-store-${{ runner.OS }}-
    - run: cabal build --only-dependencies

This generates an up-to-date freeze file and uses it to compute the cache key.

It’s necessary to get the version information into the cache key, rather than just hashing e.g. the cabal file itself. Otherwise, the cache key stays constant across runs, so the cache is never updated, even as it becomes outdated relative to upstream.
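A small local experiment illustrates the difference. The sketch below builds keys shaped like the one in the workflow, with sha256sum standing in for hashFiles (so the digests will not match what GitHub computes). cache_key is a hypothetical helper, and the file contents and version numbers are made up for the demonstration.

```shell
# cache_key OS GHC_VERSION FILE
# Builds a key shaped like the workflow's. sha256sum is only a stand-in
# for hashFiles, so the digest differs from the one GitHub would compute.
cache_key() {
  digest=$(sha256sum < "$3" | cut -d ' ' -f 1)
  printf 'cabal-store-%s-%s-%s\n' "$1" "$2" "$digest"
}

demo=$(mktemp -d)

# A .cabal file with a loose bound (contents invented for this demo).
printf 'build-depends: base, hakyll >= 4.12\n' > "$demo/site.cabal"

# First run: the solver pins one release in the freeze file.
printf 'constraints: any.hakyll ==4.12.5.2\n' > "$demo/cabal.project.freeze"
cabal_before=$(cache_key Linux 8.6.5 "$demo/site.cabal")
freeze_before=$(cache_key Linux 8.6.5 "$demo/cabal.project.freeze")

# Later run: upstream has moved on, but the .cabal file is untouched.
printf 'constraints: any.hakyll ==4.13.0.0\n' > "$demo/cabal.project.freeze"
cabal_after=$(cache_key Linux 8.6.5 "$demo/site.cabal")
freeze_after=$(cache_key Linux 8.6.5 "$demo/cabal.project.freeze")

[ "$cabal_before" = "$cabal_after" ] && echo "cabal-file key: unchanged"
[ "$freeze_before" != "$freeze_after" ] && echo "freeze-file key: changed"

rm -r "$demo"
```

The key derived from the .cabal file never changes, so a stale archive would keep being restored and never replaced; the key derived from the freeze file changes as soon as the solver picks different versions.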

As dependencies get updated, the cache will keep growing. This seems to be a general problem with dependency caching that I don’t see a good way around.

Github Actions pain points, open ends

This works and I’m happy enough with it. Getting to this state was quite painful though, and I’m not thrilled with Github Actions in their current state. Some random thoughts:

Storing the GHC version in env, and referencing this later, seems needlessly verbose. I’d prefer to be able to reference the inputs to actions/setup-haskell directly.

Generally, the variable handling is messy. I spent hours trying to get the documented HOME environment variable into the cache path, before finding out that ~ works. It turns out it is possible to get environment variables to the expression level, by doing the following (yikes!):

    - id: get-home
      run: |
        echo "::set-output name=home::$HOME"
    - uses: actions/cache@v1
      with:
        path: ${{ steps.get-home.outputs.home }}/.cabal/store

The whole thing has a very ad hoc feel to it and lacks an overall design. The language is strange – why bring Javascript-like property and index syntax into the expression syntax? What’s with the weird type coercion rules? In general, it feels like a bit more distance from the Javascript sphere might have been beneficial.

I have doubts with respect to the trust model and third-party actions. I’m happy to trust the Github provided actions with my secrets, and I’m happy to trust e.g. peaceiris/actions-gh-pages@v2.5.0 with my deploy key after reviewing it. But I don’t see any guarantees that it won’t be replaced by a malicious version.

It’s easy to make typos in YAML field names that typically won’t give obvious errors. In general, getting to a working workflow is too much trial and error, due to a combination of a confused design and inaccurate documentation. The system seems best learnt by copying and modifying existing scripts.

I like the option of using and providing third-party actions. However, I’m not so convinced right now by the design here. I don’t see a (straightforward) way to bundle up actions/setup-haskell and actions/cache to provide a one-stop Haskell setup. And neither dropping to the Docker level nor using NodeJS (why?) is appealing.

There are also some open ends on the Haskell side of things.

I’m building the site with stack locally, instead of cabal. I went with cabal here because that’s what actions/setup-haskell provides, but would prefer to set this up with stack.

I’m using a forked version of Hakyll via cabal.project, and that is getting rebuilt every time. It would be nice to figure out how to cache this.