This is part three of a five-part series on scaling game servers with Kubernetes.

In the previous two posts we looked at hosting dedicated game servers on Kubernetes and measuring and limiting their memory and CPU resources. In this instalment we look at how we can use the CPU information from the previous post to determine when we need to scale up our Kubernetes cluster because we’ve run out of room for more game servers as our player base increases.

Separating Apps and Game Servers

The first step, before writing any code to increase the size of the Kubernetes cluster, is to separate our applications (such as matchmakers, the game server controllers, and the soon-to-be-written node scaler) onto different nodes in the cluster from those running the game servers.

This has several benefits:

The resource usage of our applications now has no effect on the game servers, as they are on different machines. If the matchmaker has a CPU spike for some reason, there is an extra barrier ensuring it cannot unduly affect a dedicated game server in play.

It makes scaling dedicated game server capacity up and down easier, as we only need to look at game server usage across a specific set of nodes, rather than all potential containers across the entire cluster.

We can use bigger machines with more CPU cores and memory for the game server nodes, and smaller machines with fewer cores and less memory for the controller applications, which need fewer resources in this instance. We are essentially able to pick the right size of machine for the job at hand. This gives us great flexibility while still being cost effective.

Kubernetes makes setting up a heterogeneous cluster relatively straightforward and gives us the tools to specify where Pods are scheduled within the cluster, via the power of Node Selectors on our Pods.

It’s worth noting that there is also a more sophisticated Node Affinity feature in beta, but we don’t need it for this example, so we’ll ignore its extra complexity for now.

To get started, we need to assign labels (a set of key-value pairs) to the nodes in our cluster. These work exactly like the labels you will have seen if you’ve ever created Pods with Deployments and exposed them with Services, but applied to nodes instead. I’m using Google Cloud Platform’s Container Engine, which uses Node Pools to apply labels to nodes as they are created and to set up heterogeneous clusters, but you can do similar things on other cloud providers, as well as directly through the Kubernetes API or the command line client.

In this example, I added the labels role:apps and role:game-server to the appropriate nodes in my cluster. We can then add a nodeSelector option to our Kubernetes configurations to control which nodes in the cluster Pods are scheduled onto.

For example, here is the configuration for the matchmaker application, where you can see the nodeSelector set to role:apps to ensure it has container instances created only on the application nodes (those tagged with the “apps” role).

deployment.yaml

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: matchmaker
    spec:
      replicas: 5
      template:
        metadata:
          labels:
            role: matchmaker-server
        spec:
          nodeSelector:
            role: apps # here is the node selector
          containers:
          - name: matchmaker
            image: gcr.io/soccer/matchmaker
            ports:
            - containerPort: 8080

By the same token, we can adjust the configuration from the previous article to make all the dedicated game server Pods schedule just on the machines we specifically designated for them, i.e. those tagged with role: game-server:

pod.yaml

    apiVersion: v1
    kind: Pod
    metadata:
      generateName: "game-"
    spec:
      hostNetwork: true
      restartPolicy: Never
      nodeSelector:
        role: game-server # here is the node selector
      containers:
        - name: soccer-server
          image: gcr.io/soccer/soccer-server:0.1
          env:
            - name: SESSION_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          resources:
            limits:
              cpu: "0.1"

Note that in my sample code, I use the Kubernetes API to provide a configuration identical to the one above, but the yaml version is easier to understand, and it is the format we’ve been using throughout this series.
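
To make the nodeSelector semantics concrete, here is a minimal Go sketch of the matching rule: a node is eligible only if it carries every key-value pair listed in the selector. The function name matchesSelector and the node names are made up for illustration; this is not the scheduler's actual code, just the rule it applies to the labels used in this post.

```go
package main

import "fmt"

// matchesSelector reports whether a node's labels satisfy a Pod's
// nodeSelector: every key in the selector must be present on the node
// with exactly the same value. An empty selector matches any node.
func matchesSelector(nodeLabels, selector map[string]string) bool {
	for key, want := range selector {
		if got, ok := nodeLabels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical node names; the labels mirror the ones used in this post.
	nodes := map[string]map[string]string{
		"node-a": {"role": "apps"},
		"node-b": {"role": "game-server"},
	}
	selector := map[string]string{"role": "game-server"}

	for name, labels := range nodes {
		fmt.Printf("%s eligible for game servers: %v\n", name, matchesSelector(labels, selector))
	}
}
```

Extra labels on a node do not prevent a match, which is why a single role label is enough to split the cluster in two.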

A Strategy for Scaling Up

Kubernetes on cloud providers tends to come with automated scaling capabilities, such as the Google Cloud Platform Cluster Autoscaler, but since they are generally built for stateless applications, and our dedicated game servers store the game simulation in memory, they won’t work in this case. However, with the tools that Kubernetes gives us, it’s not particularly difficult to build our own custom Kubernetes cluster autoscaler!

Scaling the nodes in a Kubernetes cluster up and down probably makes more sense in a cloud environment, since we only want to pay for the resources that we need and use. If we were running on our own premises, it may make less sense to change the size of our Kubernetes cluster. We could just run one or more large clusters across all the machines we own and leave them at a static size, since adding and removing physical machines is far more onerous than in the cloud and wouldn’t necessarily save us money, as we own or lease the machines for much longer periods.

There are multiple potential strategies for determining when you want to scale up the number of nodes in your cluster, but for this example we’ll keep things relatively simple:

Define a minimum and maximum number of nodes for game servers, and make sure we are within that limit.

Use CPU resource capacity and usage as our metric to track how many dedicated game servers we can fit on a node in our cluster (in this example we’re going to assume we always have enough memory).

Define a buffer of CPU capacity, expressed as a set number of game servers, that must be available in the cluster at all times. I.e. add more nodes if at any point we couldn’t add n more servers to the cluster without running out of CPU resources.

Whenever a new dedicated game server is started, calculate if we need to add a new node in the cluster because the CPU capacity across the nodes is under the buffer amount.

As a fail-safe, every n seconds, also calculate if we need to add a new node to the cluster because the measured free CPU capacity is under the buffer.
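
The decision at the heart of this strategy can be sketched in a few lines of Go. This is a simplified illustration, not the code from the node scaler itself: the function name nodesToAdd and its parameters are assumptions, and capacity is counted in whole game server "slots" rather than raw CPU.

```go
package main

import "fmt"

// nodesToAdd returns how many nodes should be added so that at least
// `buffer` more game servers could be started, while keeping the node
// count between minNodes and maxNodes.
//
// freeSlots is the number of additional game servers that fit right now,
// and slotsPerNode is how many game servers one empty node can hold.
func nodesToAdd(freeSlots, buffer, slotsPerNode, currentNodes, minNodes, maxNodes int) int {
	add := 0

	// First make sure we are at least at the minimum node count.
	if currentNodes < minNodes {
		add = minNodes - currentNodes
		freeSlots += add * slotsPerNode
	}

	// Then add enough nodes to cover any shortfall in the buffer.
	if freeSlots < buffer && slotsPerNode > 0 {
		shortfall := buffer - freeSlots
		add += (shortfall + slotsPerNode - 1) / slotsPerNode // ceiling division
	}

	// Never go past the maximum node count.
	if currentNodes+add > maxNodes {
		add = maxNodes - currentNodes
	}
	if add < 0 {
		add = 0
	}
	return add
}

func main() {
	// Example figures: one node that fits 40 game servers, 11 of them
	// already running, a buffer of 30, and between 1 and 15 nodes allowed.
	fmt.Println("nodes to add:", nodesToAdd(40-11, 30, 40, 1, 1, 15))
}
```

The same function would be called both when a game server starts and on the periodic fail-safe check, so the two triggers can never disagree about what "under the buffer" means.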

Creating a Node Scaler

The node scaler essentially runs an event loop to carry out the strategy outlined above.

Using Go in combination with the native Kubernetes Go client library makes this relatively straightforward to implement, as you can see below in the Start() function of my node scaler.

Note that I’ve removed most of the error handling and other boilerplate to make the event loop clearer, but the original code is here if you are interested.

server.go

    // Start the HTTP server on the given port
    func (s *Server) Start() error {
    	// Access Kubernetes and return a client
    	s.cs, _ = kube.ClientSet()

    	// ... there be more code here ...

    	// Use the K8s client's watcher channels to see game server events
    	gw, _ := s.newGameWatcher()
    	gw.start()

    	// async loop around either the tick, or the event stream
    	// and then scaleNodes() if either occur.
    	go func() {
    		log.Print("[Info][Start] Starting node scaling...")
    		tick := time.Tick(s.tick)

    		// ^^^ MAIN EVENT LOOP HERE ^^^
    		for {
    			select {
    			case <-gw.events:
    				log.Print("[Info][Scaling] Received Event, Scaling...")
    				s.scaleNodes()
    			case <-tick:
    				// log the configured interval rather than the channel itself
    				log.Printf("[Info][Scaling] Tick of %v, Scaling...", s.tick)
    				s.scaleNodes()
    			}
    		}
    	}()

    	// Start the HTTP server
    	return errors.Wrap(s.srv.ListenAndServe(), "Error starting server")
    }

For those of you who aren’t as familiar with Go, let’s break this down a little bit:

kube.ClientSet() – we have a small piece of utility code, which returns a Kubernetes ClientSet that gives us access to the Kubernetes API of the cluster that we are running on.

gw, _ := s.newGameWatcher() – Kubernetes has APIs that allow you to watch for changes across the cluster. In this particular case, the code here returns a data structure containing a Go Channel (essentially a blocking queue), specifically gw.events, that will return a value whenever a Pod for a game is added or deleted in the cluster. Look here for the full source for the gameWatcher.

tick := time.Tick(s.tick) – this creates another Go Channel that blocks until a given interval has elapsed, in this case 10 seconds, and then returns a value, repeating for every interval after that. If you would like to look at it, here is the reference for time.Tick.

The main event loop is under the “// ^^^ MAIN EVENT LOOP HERE ^^^” comment. Within this code block is a select statement. This essentially declares that the system will block until either the gw.events channel or the tick channel (firing every 10s) returns a value, and then execute s.scaleNodes(). This means that scaleNodes will run whenever a game server is added or removed, and also every 10 seconds.

s.scaleNodes() – run the scale node strategy as outlined above.
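
If you want to experiment with this pattern outside of Kubernetes, here is a small, self-contained Go sketch of the same select loop. It is an illustration rather than the node scaler's code: runLoop, the done channel used to stop the loop, and the scale callback are all hypothetical names, and it uses time.NewTicker so the ticker can be stopped cleanly.

```go
package main

import (
	"fmt"
	"time"
)

// runLoop calls scale whenever an event arrives or the ticker fires,
// and returns once done is closed.
func runLoop(events <-chan struct{}, tick <-chan time.Time, done <-chan struct{}, scale func(reason string)) {
	for {
		select {
		case <-events:
			scale("event")
		case <-tick:
			scale("tick")
		case <-done:
			return
		}
	}
}

func main() {
	events := make(chan struct{})
	done := make(chan struct{})
	ticker := time.NewTicker(50 * time.Millisecond)
	defer ticker.Stop()

	finished := make(chan struct{})
	go func() {
		runLoop(events, ticker.C, done, func(reason string) {
			fmt.Println("scaling because of", reason)
		})
		close(finished)
	}()

	events <- struct{}{}               // simulate a game server Pod being added
	time.Sleep(120 * time.Millisecond) // give the ticker time to fire
	close(done)                        // ask the loop to stop
	<-finished                         // wait for it to exit
}
```

Because both triggers funnel into the same callback, the scaling logic only ever runs on one goroutine, so it never needs its own locking.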

Within s.scaleNodes() we query the CPU limits that we set on each Pod, as well as the total CPU available on each Kubernetes node within the cluster, through the Kubernetes API. We can see the configured CPU limits in the Pod specification via the REST API and Go client, which gives us the ability to track how much CPU each of our game servers is taking up, as well as any of the Kubernetes management Pods that may also exist on the node. Through the Node specification, the Go client can also track the amount of CPU capacity available in each node. From here it is a case of summing up the amount of CPU used by Pods, subtracting it from the capacity for each node, and then determining if one or more nodes need to be added to the cluster, such that we can maintain that buffer space for new game servers to be created in.
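
The arithmetic described above can be sketched as follows. To keep the example runnable without a cluster, the node and pod types here are simplified, hypothetical stand-ins for the objects the Kubernetes Go client returns, and CPU is expressed in millicores (1000 millicores is one core, so a limit of "0.1" is 100).

```go
package main

import "fmt"

// node and pod are simplified stand-ins for the Kubernetes API objects.
// All CPU values are in millicores.
type node struct {
	name        string
	capacityCPU int64
}

type pod struct {
	nodeName string
	limitCPU int64
}

// freeSlots returns how many more game servers, each needing serverCPU
// millicores, fit across the given nodes. Free CPU is counted per node,
// because a single game server cannot be split across two nodes.
func freeSlots(nodes []node, pods []pod, serverCPU int64) int64 {
	if serverCPU <= 0 {
		return 0
	}

	// Sum the CPU limits of every Pod scheduled on each node.
	used := make(map[string]int64)
	for _, p := range pods {
		used[p.nodeName] += p.limitCPU
	}

	// Subtract usage from capacity and count whole servers that still fit.
	var slots int64
	for _, n := range nodes {
		if free := n.capacityCPU - used[n.name]; free > 0 {
			slots += free / serverCPU
		}
	}
	return slots
}

func main() {
	nodes := []node{{"node-1", 4000}, {"node-2", 4000}}

	// Ten game servers on node-1, plus one management Pod on node-2.
	var pods []pod
	for i := 0; i < 10; i++ {
		pods = append(pods, pod{"node-1", 100})
	}
	pods = append(pods, pod{"node-2", 250})

	fmt.Println("game servers that still fit:", freeSlots(nodes, pods, 100))
}
```

Counting per node rather than pooling the free CPU matters: two nodes with 50 millicores free each cannot host a single 100 millicore game server.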

If you dig into the code in this example, you’ll see that we are using the APIs on Google Cloud Platform to add new nodes to the cluster. The APIs that are provided for Google Compute Engine Managed Instance Groups allow us to add (and remove) instances from the Node Pool in the Kubernetes cluster. That being said, any cloud provider will have similar APIs to let you do the same thing, and here you can see the interface we’ve defined to abstract this implementation detail in such a way that it could be easily modified to work with another provider.
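
As a rough idea of what such an abstraction can look like, here is a hedged sketch. The NodePool interface, its two methods, and the in-memory fakePool are invented for this illustration and are not the interface from the linked code; a real implementation would call the cloud provider's API inside Size and Resize.

```go
package main

import (
	"errors"
	"fmt"
)

// NodePool is a hypothetical abstraction over a cloud provider's API for
// resizing the group of machines that back the game server nodes.
type NodePool interface {
	Size() (int, error)
	Resize(count int) error
}

// fakePool is an in-memory implementation for local experiments and tests.
type fakePool struct {
	size, max int
}

func (p *fakePool) Size() (int, error) { return p.size, nil }

func (p *fakePool) Resize(count int) error {
	if count < 0 || count > p.max {
		return errors.New("requested size is out of range")
	}
	p.size = count
	return nil
}

// grow adds n nodes to the pool using only the provider-agnostic interface.
func grow(pool NodePool, n int) error {
	current, err := pool.Size()
	if err != nil {
		return err
	}
	return pool.Resize(current + n)
}

func main() {
	pool := &fakePool{size: 1, max: 15}
	if err := grow(pool, 2); err != nil {
		fmt.Println("could not grow pool:", err)
		return
	}
	size, _ := pool.Size()
	fmt.Println("pool size is now", size)
}
```

Keeping the scaling logic behind an interface like this also makes it easy to test without touching any real infrastructure.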

Deploying the Node Scaler

Below you can see the deployment YAML for the node scaler. As you can see, environment variables are used to set all the configuration options, including:

Which nodes in the cluster should be managed

How much CPU each dedicated game server needs

The minimum and maximum number of nodes

How much buffer should exist at all times

deployment.yaml

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: nodescaler
    spec:
      replicas: 1 # only want one, to avoid race conditions
      strategy:
        type: Recreate # a Deployment-level setting, so it sits beside replicas
      template:
        metadata:
          labels:
            role: nodescaler-server
        spec:
          nodeSelector:
            role: apps
          containers:
          - name: nodescaler
            image: gcr.io/soccer/nodescaler
            env:
              - name: NODE_SELECTOR # the nodes to be managed
                value: "role=game-server"
              - name: CPU_REQUEST # how much CPU each server needs
                value: "0.1"
              - name: BUFFER_COUNT # how many servers do we need buffer for
                value: "30"
              - name: TICK # how often to tick over and recheck everything
                value: "10s"
              - name: MIN_NODE # minimum number of nodes for game servers
                value: "1"
              - name: MAX_NODE # maximum number of nodes for game servers
                value: "15"

You may have noticed that we set the deployment to have replicas: 1 . We did this because we always want to have only one instance of the node scaler active in our Kubernetes cluster at any given point in time. This ensures that we do not have more than one process attempting to scale up, and eventually scale down, our nodes within the cluster, which could definitely lead to race conditions and likely cause all kinds of weirdness.

Similarly, when we update the node scaler, we want the old instance to be fully shut down before a new one is created. We therefore also configure strategy.type: Recreate, so that Kubernetes destroys the currently running node scaler Pod before creating the newer version, again avoiding any potential race conditions.
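
On the Go side, environment variables like the ones in the deployment above could be read with something along these lines. This is a sketch under assumptions: the config struct and loadConfig function are hypothetical names, and the validation shown is only one reasonable choice, not necessarily what the node scaler does.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// config mirrors the environment variables set in the deployment.
type config struct {
	nodeSelector string        // NODE_SELECTOR
	cpuRequest   float64       // CPU_REQUEST, in cores
	bufferCount  int           // BUFFER_COUNT
	tick         time.Duration // TICK
	minNode      int           // MIN_NODE
	maxNode      int           // MAX_NODE
}

// loadConfig parses the settings returned by lookup, which is os.Getenv in
// main and can be swapped for a map-backed function in tests.
func loadConfig(lookup func(string) string) (config, error) {
	var c config
	var err error

	c.nodeSelector = lookup("NODE_SELECTOR")
	if c.nodeSelector == "" {
		return c, fmt.Errorf("NODE_SELECTOR must be set")
	}
	if c.cpuRequest, err = strconv.ParseFloat(lookup("CPU_REQUEST"), 64); err != nil {
		return c, fmt.Errorf("CPU_REQUEST: %w", err)
	}
	if c.bufferCount, err = strconv.Atoi(lookup("BUFFER_COUNT")); err != nil {
		return c, fmt.Errorf("BUFFER_COUNT: %w", err)
	}
	if c.tick, err = time.ParseDuration(lookup("TICK")); err != nil {
		return c, fmt.Errorf("TICK: %w", err)
	}
	if c.minNode, err = strconv.Atoi(lookup("MIN_NODE")); err != nil {
		return c, fmt.Errorf("MIN_NODE: %w", err)
	}
	if c.maxNode, err = strconv.Atoi(lookup("MAX_NODE")); err != nil {
		return c, fmt.Errorf("MAX_NODE: %w", err)
	}
	if c.minNode > c.maxNode {
		return c, fmt.Errorf("MIN_NODE (%d) is greater than MAX_NODE (%d)", c.minNode, c.maxNode)
	}
	return c, nil
}

func main() {
	c, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```

Failing fast on a bad value at start-up is preferable here, since a node scaler running with a silently defaulted maximum could become an expensive mistake.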

See it in Action

Once we have deployed our node scaler, let’s tail the logs and see it in action. In the video below, we see via the logs that when we have one node in the cluster assigned to game servers, we have capacity to potentially start forty dedicated game servers, and have configured a requirement of a buffer of 30 dedicated game servers. As we fill the available CPU capacity with running dedicated game servers via the matchmaker, pay attention to how the number of game servers that can be created in the remaining space drops and eventually, a new node is added to maintain the buffer!

Next Steps

The fact that we can do this without having to build so much of the foundation is one of the things that gets me so excited about Kubernetes. While we touched on the Kubernetes client in the first post in this series, in this post we’ve really started to take advantage of it. This is what I feel the true power of Kubernetes really is: an integrated set of tools for running software over a large cluster, which you have a huge amount of control over. In this instance, we haven’t had to write code to spin up and spin down dedicated game servers in very specific ways; we could just leverage Pods. When we want to take control and react to events within the Kubernetes cluster itself, we have the Watch APIs that enable us to do just that! It’s quite amazing how much core utility Kubernetes gives you out of the box, utility that many of us have been building ourselves for years and years.

That all being said, scaling up nodes and game servers in our cluster is the comparatively easy part; scaling down is a trickier proposition. We’ll need to make sure nodes don’t have game servers on them before shutting them down, while also ensuring that game servers don’t end up widely fragmented across the cluster. In the next post in this series we’ll look at how Kubernetes can help in these areas as well!

In the meantime, as with the previous posts – I welcome questions and comments here, or reach out to me via Twitter. You can see my presentation at GDC this year as well as check out the code in GitHub, which is still being actively worked on!
