Transcript

Awesome! So, I’m Joe. I work at GitHub and today I’m going to talk about GLB. We started building GLB back in about 2015 and it went into production in 2016. It was built by me and another engineer at GitHub, Theo Julienne. We built GLB as a replacement for a fleet of unscalable HAProxy hosts whose configurations were monolithic, untestable, and just completely confusing and kind of terrible. That infrastructure was built long before I joined GitHub, but I was there to replace it.

At that point in time, we had a number of design tenets that we wanted to follow when building GLB to mitigate a lot of the issues we had with the previous system. Some of those were: we wanted something that ran on commodity hardware. You saw the F5 in the previous talk; we didn’t want to go down that F5 route.

We wanted something that scaled horizontally and something that supported high availability and avoided breaking TCP sessions whenever we could. We wanted something that supported connection draining, basically being able to pull hosts in and out of production easily, again without breaking TCP connections. We wanted something that was per-service. We have a lot of different services at GitHub. github.com is one big service, but we have lots of internal services that we wanted to isolate into their own HAProxy instances.
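To make the connection-draining idea concrete, here is a minimal sketch of draining a backend server through HAProxy’s runtime (admin) socket, since HAProxy is the proxy layer mentioned here. The socket path and the backend/server names are hypothetical, and this is generic HAProxy usage, not necessarily how GLB itself performs draining.

```python
# Minimal sketch: drain an HAProxy backend server via the runtime socket.
# Socket path and backend/server names are hypothetical examples.
import socket

def haproxy_command(cmd: str, sock_path: str = "/var/run/haproxy.sock") -> str:
    """Send one command to HAProxy's runtime API and return its reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall((cmd + "\n").encode())
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
        return b"".join(chunks).decode()

# "drain": existing connections are allowed to finish, no new ones are sent,
# so the host can be pulled from production without breaking TCP sessions.
print(haproxy_command("set server web_backend/web01 state drain"))

# Later, put the host back into rotation.
print(haproxy_command("set server web_backend/web01 state ready"))
```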

We wanted something that we could iterate on just like any other code at GitHub, and that would live in a Git repository. We wanted something that was also testable at every layer, so we could ensure that each component was working properly. We had also been expanding into multiple data centers across the globe and wanted to build something that was designed for multiple PoPs and data centers. And then lastly, we wanted something that was resilient to DoS attacks, because those are, unfortunately, very common for GitHub.