GitHub Pages

Do you work on or maintain a project for technical users? A key part of attracting users, especially to an open source project, is publishing great documentation. However, keeping it up to date as your APIs and concepts change can be challenging or just time-consuming.

A popular way to maintain great docs is to keep them in your project’s repo. Often they’re built from some kind of easy-to-edit source format (like Markdown) and rendered as HTML. Once you’ve built the HTML, where do you publish it? For open source projects on GitHub, a seemingly obvious choice is GitHub Pages.

GitHub Pages will automatically handle building Jekyll content for you. In my case, however, I want to generate my own HTML. First, I’ll show you what I set up in my own GitHub repo. At the end, I’ll walk you through building and publishing your own GitHub Pages using Azure Pipelines.

What we’ll need

We’re going to need:

A system for transforming content into HTML

Some source content

A repo on GitHub to hold this stuff

And we’ll build a pipeline for automating our publishing step.

Our content system

Markdown is an extremely popular source format for documentation, and so is reStructuredText (at least if you’re into Python). This isn’t a post about Markdown or rST, though. To keep things generic, I’m going to invent the world’s silliest documentation system: all it knows how to do is take a directory of HTML files and replace the token “{{ NOW }}” with the current time. It’s a shell script like this:

#!/usr/bin/env bash
# docs.sh

# directory containing this script
ROOT=$(cd "$(dirname "$0")" && pwd)
SRC_DIR=$ROOT/src
DEST_DIR=$ROOT
NOW=$(date)

# if we don't have any HTML files, don't do anything
shopt -s nullglob

for f in "$SRC_DIR"/*.html
do
  echo "Processing $f"
  DEST_FILE=$DEST_DIR/$(basename "$f")
  # replace "{{ NOW }}" with the time this script started
  sed "s/{{ NOW }}/$NOW/g" <"$f" >"$DEST_FILE"
done

And as for source content, we’ll start with just an index.html file:

<!DOCTYPE html>
<!-- src/index.html -->
<html>
  <head>
    <meta charset="UTF-8">
    <title>Hello World!</title>
  </head>
  <body>
    <h1>Hello World!</h1>
    <p>This page was generated {{ NOW }}.</p>
  </body>
</html>

Here’s what that looks like, side-by-side:

Side-by-side, before and after running docs.sh
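If you want to see the substitution without touching a real repo, here is a minimal sketch that runs the same sed replacement in a scratch directory. The temporary directory and the one-line page are assumptions for illustration; only the replacement itself mirrors docs.sh.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the docs.sh substitution in a throwaway directory.
set -euo pipefail

WORK=$(mktemp -d)            # scratch area (assumption: not a real repo)
mkdir -p "$WORK/src"

# A one-line stand-in for src/index.html
printf '<p>This page was generated {{ NOW }}.</p>\n' > "$WORK/src/index.html"

NOW=$(date)

# Same replacement as docs.sh: swap the token for the current time
sed "s/{{ NOW }}/$NOW/g" < "$WORK/src/index.html" > "$WORK/index.html"

echo "Before: $(cat "$WORK/src/index.html")"
echo "After:  $(cat "$WORK/index.html")"
```

One caveat: because the time is spliced straight into the sed expression, a date format that contains a slash would break the command. If your locale prints dates that way, pick a different sed delimiter.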

Our GitHub Pages repo

If you aren’t familiar with it, GitHub Pages lets you push HTML content to a Git repo and have it automatically show up on an HTTP server. You can make Pages for a project, for yourself, or for an organization (with slightly different capabilities on each). I followed GitHub’s great tutorial on Pages from the command line to get started. My username is vtbassmatt, so I decided to make a user page for myself. My repo is called vtbassmatt/vtbassmatt.github.io.

Creating the repo

Because I’m publishing a user page, GitHub will publish whatever is on master. I also chose to leave the source of my content in master. This gives me a neat side effect: the source for my page is accessible on the web (at /src) alongside the “rendered” HTML.

Generated HTML being served up by GitHub Pages

The pipeline

The heart of the system is this Azure Pipelines YAML file:

# Publish GitHub Pages
# azure-pipelines.yml

trigger:
- master

pool:
  vmImage: 'Ubuntu-16.04'

steps:
- script: |
    ./docs.sh
    git config --local user.name "Azure Pipelines"
    git config --local user.email "azuredevops@microsoft.com"
    git add .
    git commit -m "Publishing GitHub Pages ***NO_CI***"
  displayName: 'Build and commit pages'

- task: DownloadSecureFile@1
  inputs:
    secureFile: deploy_key
  displayName: 'Get the deploy key'

- script: |
    mkdir ~/.ssh && mv $DOWNLOADSECUREFILE_SECUREFILEPATH ~/.ssh/id_rsa
    chmod 700 ~/.ssh && chmod 600 ~/.ssh/id_rsa
    ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts
    git remote set-url --push origin git@github.com:vtbassmatt/vtbassmatt.github.io.git
    git push origin HEAD:master
  displayName: 'Publish GitHub Pages'
  condition: |
    and(not(eq(variables['Build.Reason'], 'PullRequest')),
        eq(variables['Build.SourceBranch'], 'refs/heads/master'))

This pipeline will trigger whenever I push to master and will run on the hosted Ubuntu agent pool. The first script step will run my silly doc generator, then check in the generated docs.

What’s that ***NO_CI*** token for? We’re eventually going to push this commit back to master. But recall that this pipeline triggers on pushes to master… which would lead to an infinite loop of pipelines running. The ***NO_CI*** statement tells Azure Pipelines not to trigger on this commit. (Azure Pipelines also understands a few other ways to skip CI for a commit.)
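To make the skip-token idea concrete, here is a small sketch that checks a commit message for a few of the tokens Azure Pipelines recognizes. The real matching happens on the Azure Pipelines side; has_skip_token is a hypothetical helper of my own, and the token list here is deliberately partial.

```shell
#!/usr/bin/env bash
# Sketch: check whether a commit message carries a CI-skip token.
# Azure Pipelines does this matching on its side; this helper only
# mimics the idea so you can sanity-check a message locally.

# has_skip_token is a hypothetical helper, not part of any tool.
has_skip_token() {
  local message=$1
  case "$message" in
    # Quoted patterns are matched literally, so the asterisks and
    # brackets are not treated as glob characters.
    *'***NO_CI***'*|*'[skip ci]'*|*'[ci skip]'*) return 0 ;;
    *) return 1 ;;
  esac
}

if has_skip_token 'Publishing GitHub Pages ***NO_CI***'; then
  echo "CI would be skipped for the publishing commit"
fi

if ! has_skip_token 'Fix a typo in src/index.html'; then
  echo "CI would run for an ordinary commit"
fi
```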

The next step is a task which downloads a file that’s been securely stored. That file is the private key of a GitHub deploy key. When my build agent presents the private key, GitHub will allow it to authenticate and push changes to the repo.

Generating a deploy key
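For reference, generating a key pair from the command line might look like the sketch below. I’m assuming an RSA key because the pipeline stores it as id_rsa; the scratch directory and the comment string are placeholders, not values from my setup.

```shell
#!/usr/bin/env bash
# Sketch: generate an RSA key pair to use as a GitHub deploy key.
# KEY_DIR and the comment string are placeholders (assumptions).

KEY_DIR=$(mktemp -d)

# -N "" means no passphrase, so the build agent can use the key unattended.
# -f sets the output path; the public half lands next to it with a .pub suffix.
ssh-keygen -q -t rsa -b 4096 -C "azure-pipelines-deploy" -N "" -f "$KEY_DIR/deploy_key"

echo "Private key (upload to Azure Pipelines as a secure file): $KEY_DIR/deploy_key"
echo "Public key (add to the GitHub repo as a deploy key):      $KEY_DIR/deploy_key.pub"
```

When you add the public half on GitHub, remember to allow write access for the deploy key; a read-only key can fetch but cannot push the generated pages.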

Finally, the last script step pushes the commit back to GitHub. SSH is picky about file locations, directory permissions, and connecting to a host it has never seen before. The first three lines take care of getting the private key in the right place.
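Here is a sketch of what those key-handling lines accomplish, run against a scratch directory so it is safe to try locally. The scratch path stands in for ~/.ssh and a dummy file stands in for the downloaded secure file; both are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: put a private key where SSH expects it, with the permissions it insists on.
# Uses a scratch directory in place of ~/.ssh and a dummy file in place of the real key.
set -euo pipefail

SCRATCH=$(mktemp -d)
SSH_DIR="$SCRATCH/.ssh"                 # stand-in for ~/.ssh
DOWNLOADED_KEY="$SCRATCH/deploy_key"    # stand-in for $DOWNLOADSECUREFILE_SECUREFILEPATH
echo "dummy key material" > "$DOWNLOADED_KEY"

mkdir -p "$SSH_DIR"                     # -p: no error if the directory already exists
mv "$DOWNLOADED_KEY" "$SSH_DIR/id_rsa"
chmod 700 "$SSH_DIR"                    # only the owner may enter the directory
chmod 600 "$SSH_DIR/id_rsa"             # only the owner may read or write the key

ls -ld "$SSH_DIR" "$SSH_DIR/id_rsa"
```

The ssh-keyscan line is left out of this sketch because it needs network access; on the agent it records GitHub’s host key in known_hosts so SSH does not stop to ask about an unfamiliar host.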

It’s worth noting: Azure Pipelines has a native InstallSSHKey task. That would have handled downloading the secure file and adding the known_hosts entry. I opted to do this manually with shell scripts, mostly as a learning exercise.

The fourth line changes our push URL from HTTPS to SSH (the git@github.com: form), which tells Git to authenticate with the SSH key. Only the push URL changes; fetches keep using the original URL. You’ll obviously want to change the values to match your repo.
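You can watch a remote carry two different URLs in a throwaway repo. This is a sketch under assumptions: example-user is a placeholder account, and nothing here contacts GitHub.

```shell
#!/usr/bin/env bash
# Sketch: give a remote separate fetch and push URLs.
# example-user is a placeholder; substitute your own account and repo.
set -euo pipefail

REPO=$(mktemp -d)
git init -q "$REPO"
cd "$REPO"

git remote add origin https://github.com/example-user/example-user.github.io.git

# Only the push URL changes; fetches keep using HTTPS.
git remote set-url --push origin git@github.com:example-user/example-user.github.io.git

FETCH_URL=$(git remote get-url origin)
PUSH_URL=$(git remote get-url --push origin)
echo "fetch: $FETCH_URL"
echo "push:  $PUSH_URL"
```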

Because of the way Azure Pipelines optimizes fetching Git repos, from Git’s perspective, we aren’t actually on the master branch. That’s why we have to use the refspec HEAD:master on the final line which calls git push.
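The detached-HEAD situation is easy to reproduce locally. In this sketch a bare repo in a scratch directory stands in for GitHub, and the names and email are placeholders; it only approximates what the build agent’s checkout looks like.

```shell
#!/usr/bin/env bash
# Sketch: push from a detached HEAD to a branch by naming the destination explicitly.
# A local bare repo stands in for GitHub; all paths are scratch (assumptions).
set -euo pipefail

WORK=$(mktemp -d)
git init -q --bare "$WORK/remote.git"
git init -q "$WORK/local"
cd "$WORK/local"
git symbolic-ref HEAD refs/heads/master      # name the first branch master
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin "$WORK/remote.git"

git commit -q --allow-empty -m "Initial commit"
git push -q origin master

# Mimic the agent: check out a commit rather than a branch.
git checkout -q --detach
git commit -q --allow-empty -m "Publishing GitHub Pages"

# symbolic-ref fails on a detached HEAD, so this prints the fallback text.
git symbolic-ref -q --short HEAD || echo "HEAD is detached"

# HEAD:master means "send whatever HEAD points at to the remote's master".
git push -q origin HEAD:master

LOCAL_HEAD=$(git rev-parse HEAD)
REMOTE_MASTER=$(git -C "$WORK/remote.git" rev-parse refs/heads/master)
echo "local HEAD:    $LOCAL_HEAD"
echo "remote master: $REMOTE_MASTER"
```

After the final push, both hashes should match even though the local repo was never “on” master when the publishing commit was made.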

That condition is a little wild as well. You can read it like a prefix-notation functional language (or an Excel formula, if you prefer): “Run this step only if the variable Build.Reason is NOT ‘PullRequest’ and the variable Build.SourceBranch is ‘refs/heads/master’.”
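If the prefix notation is hard to read, the same logic can be written as an ordinary shell test. This is only a sketch: should_publish is a hypothetical helper, and the default values are samples so it runs outside a pipeline. Inside a pipeline, scripts see Build.Reason and Build.SourceBranch as the environment variables BUILD_REASON and BUILD_SOURCEBRANCH.

```shell
#!/usr/bin/env bash
# Sketch: the step condition rewritten as a shell test.
# should_publish is a hypothetical helper, not part of Azure Pipelines.

should_publish() {
  local reason=$1 branch=$2
  # and( not(eq(reason, 'PullRequest')), eq(branch, 'refs/heads/master') )
  [ "$reason" != "PullRequest" ] && [ "$branch" = "refs/heads/master" ]
}

# Defaults below are sample values so the sketch runs outside a pipeline.
if should_publish "${BUILD_REASON:-IndividualCI}" "${BUILD_SOURCEBRANCH:-refs/heads/master}"; then
  echo "would publish"
else
  echo "would skip publishing"
fi
```

One difference to keep in mind: the shell comparison is case-sensitive, while Azure’s eq ignores case for strings, so treat this as a reading aid rather than an exact equivalent.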

Following along at home

Now we have all the pieces in place. To replicate what I’ve done:

1. Set up your GitHub repo with the shell script, the source content, and the azure-pipelines.yml file. Make sure to edit the pipeline to apply to your GitHub repo. (Hint: your GitHub username is not vtbassmatt!)

2. Install the Azure Pipelines app and go through the setup experience. Your first build will fail because you don’t have the secure file in place; that’s OK.

3. Generate your deploy key and give the public half to GitHub.

4. Give the private half of the deploy key to Azure Pipelines: go to the Library on your Azure Pipelines organization and create a secure file called “deploy_key”. You’ll also want to click Edit on the secure file and check the “Authorize for use in all pipelines” box.

5. Go back to GitHub and use the web editor to change files in the /src folder. Start a PR. The pipeline will run, but it will skip the step to push the built content to GitHub.

6. Complete the PR. The pipeline will run again, this time as a continuous integration trigger to master. The resulting content will be automatically pushed back to master and ultimately deployed on GitHub Pages!

Giving the public key to GitHub

Giving the private key to Azure Pipelines

Making a web edit to trigger the pipeline

Questions or feedback? Let me know in the comments.