ADAM GLICK: We haven't so far. But why not start? What birthday are you thinking of?

CRAIG BOX: If this is episode 26, then it must be our six-month anniversary.

ADAM GLICK: [LAUGH] My wife's going to get jealous here.

CRAIG BOX: Raise a glass. Congratulations.

ADAM GLICK: Yeah. Well done, Craig. This has been a lot of fun.

CRAIG BOX: We spend a lot of time together.

ADAM GLICK: Indeed. It's been great. Thank you to all of you, the listeners, who have been listening, tuning in, and recommending this. The growth of the podcast has been spectacular, and it's all due to you. So thank you for both listening and the feedback that you've sent in.

CRAIG BOX: Indeed, it is our pleasure to bring this tale to you every week. It has been our pleasure to be back home in our respective locations for the last week. It's been turning autumn, or fall, if you like, here in Britain. So we have this one week in October where everything is mercifully bright and warm again. You think, oh, that's quite nice. And then basically the temperature halves the next week. We're in that week right now.

ADAM GLICK: Then you can just wander off into the pub and find solace, right?

CRAIG BOX: Yeah, beautiful setup there. And we have a few friends that like to go down to the local pub every now and then. And there's a quiz on Thursday nights. And last time we went, we won the pub quiz, which was good. And one of the prizes was free entry into the next pub quiz. So it took about six months with all our different travels and so on for us to be back in the same place. So we went down again last Thursday, and what do you know? We won again.

ADAM GLICK: It's a wonderful perpetual entry machine, I guess.

CRAIG BOX: Little bit.

ADAM GLICK: So congratulations. Let's get to the news.

[MUSIC PLAYING]

CRAIG BOX: The Kubernetes v2 Provider for Spinnaker, as discussed in episodes 23 and 24, has been released, along with Spinnaker 1.10. Spinnaker's redesigned Kubernetes support is now based on manifests. It automatically takes care of some of the lower-level complexities of manifest management, such as correct handling of labels. Planned features include traffic management, dynamic target selection, and SDO support. You can also try this new provider in a continuous delivery code lab that has been published by Google Cloud.

ADAM GLICK: With the upcoming KubeCon in December, there will also be a contributor summit for those of you who are, or would like to be, contributors to the Kubernetes project. There are 300 spots available, and registrations are now open. So if you're going to be at KubeCon Seattle and are a contributor or you're looking to become one, go sign up today.

CRAIG BOX: If you are considering enterprise support for containers off the cloud, you might like to read Forrester's report of 2018 enterprise vendors. Their research uncovered a market in which Docker, Red Hat, and Rancher Labs are considered leaders, Pivotal, Mesosphere, and IBM are considered strong performers, and SUSE and Platform9 are considered contenders. The report evaluates products that were generally available this year, so expect to see some new entrants in the top right-hand corner next year.

ADAM GLICK: Elections are now complete for the new Kubernetes steering committee members, who will serve a two-year term. Returning members are Aaron Crickenberger, now at Google, and Tim St Clair, now at Heptio. Quinton Hoole, from Huawei, is replaced by his colleague, Davanum Srinivas.

CRAIG BOX: Dominik Tornow from SAP and Andrew Chen from Google Cloud have published a guide to understanding high availability of the Kubernetes control plane, with some excellent diagrams which we totally can't explain to you in a podcast. Andrew has been leading an effort in SIG Docs to establish a canonical, high-level conceptual overview of Kubernetes in diagram format, and this is a great step in that direction.

ADAM GLICK: If you like your Kubernetes learning with a Northern English accent-- and who doesn't-- Nigel Poulton reached out to us to let us know he's published a new video training course on training website A Cloud Guru. Nigel is the author of two self-published books on Docker and Kubernetes and has published four hours of video with over 54 lessons.

CRAIG BOX: Episode 10, with the release leads for Kubernetes 1.11, included Tim Pepper, who was shadowing in preparation for leading the release of 1.12. Three months later, Tim has written a retrospective of the 1.12 process on the VMware Open Source Blog. He helps explain his observation that, while open source moves fast in aggregate, any individual feature can take an excruciatingly long time to be ready, which can comfortably be restated as, a watched pot never boils.

ADAM GLICK: Working with multiple clusters? Admiralty are a nautically named project and/or startup who will eventually offer multi-cluster compute cost optimization for Kubernetes. To do this, they wanted to build operators to manage workloads across cluster boundaries. Neither Kubebuilder nor CoreOS's Operator SDK handles this use case, so they made something that did it and then open sourced it. The multi-cluster controller is based on the controller runtime that was factored out of Kubebuilder and is also now used in the Operator SDK. A description of this is available on their website, and code is available on GitHub.

CRAIG BOX: Speaking of building operators, Google Cloud have published their best practices on this topic, based on their experience building operators for Apache Spark and Airflow. Palak Bhatia and Jun Xiang Tee talk about both the mechanics of writing an operator for a stateful workload and how to expose metrics correctly to make the application easy to monitor.

ADAM GLICK: Finally, congratulations to Pulumi, who completed their series A round for $15 million while simultaneously announcing their commercial platform for their infrastructure-as-code offering, which is built on their open-source projects. Their enterprise version supports unlimited users, integrations with third-party tools like GitHub and Slack, as well as role-based access control, onboarding, and 12 by 5 support.

CRAIG BOX: And that's the news.

[MUSIC PLAYING]

Our guests today are Cyril Tovena and Mark Mandel, the main stewards and core contributors to the Agones project. Cyril is a technical lead on dedicated game servers at Ubisoft Montreal, and Mark works as a developer advocate at Google Cloud. Welcome, Cyril.

CYRIL TOVENA: Hi, Craig. Thanks for having me today.

CRAIG BOX: Welcome, Mark.

MARK MANDEL: Hey, how you doing?

CRAIG BOX: I think it's fair to assume that everyone listening will know what Google does. But Cyril, can you start by telling us who are Ubisoft, and what other brands might we know you better by?

CYRIL TOVENA: Sure. So Ubisoft is the biggest video game producer. And since I'm working in Montreal, the brands we have there are "Watch Dogs," "For Honor," "Rainbow Six," and also "Assassin's Creed," to name a few of them.

ADAM GLICK: Mark, perhaps you can share with people what's your connection to the game stuff, because I know you do a lot there.

MARK MANDEL: Yeah, so I've been doing developer advocacy for games for the last 2 and 1/2, three years. Actually, it's probably been closer to three years. I came from more of a web DevOps background but have been very passionate about games for a really long time. And so when I saw that there was a bit of a gap in terms of doing developer advocacy for games and what was happening with cloud there, I sort of jumped right in with both feet. And here I am now.

CRAIG BOX: Are you only interested in the back end piece, or do you do work on front end of games as well?

MARK MANDEL: I am a terrible game designer and a terrible artist, so I try and stay as far away from those things as possible. Or if I do, I just lean into the terribleness and try and make games that are just awful.

CRAIG BOX: There's a subculture for that.

MARK MANDEL: Absolutely.

CRAIG BOX: So Agones is a library for hosting, running, and scaling dedicated game servers on Kubernetes. What exactly is a dedicated game server?

CYRIL TOVENA: So a game server is a server which usually acts as an authority. So it receives events from player, so player connects to it with the game client. They send event logs of what they want to do in the game. And then the game server can verify that information and then replicate the information to the other players so they can display their own log in the game.

MARK MANDEL: It's really like a full simulation of everything that happens inside your game. And dedicated game servers are really important for a couple of reasons, one of which is unfortunately, people cheat and they do horrible things. And so as soon as you give someone a binary in which they can intercept the traffic and do stuff or even hack the binary, they will do so. So if you have a client that can basically say, hey, I now have 1,000 points, people will find a way to basically hack it so that authoritatively, it'll say, I have 1,000 points or a million points.

But if you have a dedicated game server that sits on your own network, you can have the authority over that. And it's much harder for people to cheat. It's the dedicated game server that says, hey, this player who was playing this game for the last half an hour now has 1,000 points.
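[EDITOR'S SKETCH] The authority idea Mark describes can be illustrated with a toy example. This is a minimal sketch, not code from Agones or any real game: the `AuthoritativeServer` class, the event kinds, and the point values are all hypothetical. The only point it makes is that the server computes the score from events it has verified and never reads a score supplied by the client.

```python
from dataclasses import dataclass, field

# Hypothetical rule table: points the server itself awards per event kind.
POINTS_PER_EVENT = {"hit": 10, "capture": 100}


@dataclass
class AuthoritativeServer:
    """Keeps the only trusted copy of each player's score."""
    scores: dict = field(default_factory=dict)

    def handle_event(self, player: str, event: dict) -> int:
        # Only the event kind is read; any score the client claims is ignored.
        kind = event.get("kind")
        awarded = POINTS_PER_EVENT.get(kind, 0)
        self.scores[player] = self.scores.get(player, 0) + awarded
        return self.scores[player]


server = AuthoritativeServer()
server.handle_event("mark", {"kind": "hit"})
# A hacked client claims a million points; the claim has no effect.
server.handle_event("mark", {"kind": "hit", "claimed_score": 1_000_000})
print(server.scores["mark"])  # 20
```

A real server would also check that each event is plausible (position, timing, line of sight) before awarding anything; that validation is omitted here.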

CRAIG BOX: It's like having a judge on a quiz show.

MARK MANDEL: Yeah, pretty much.

CRAIG BOX: That's a great answer, Mark. 1,000 points to you.

MARK MANDEL: [LAUGHING] The points don't matter. The other thing that's really important about dedicated game servers is it gives you geographical control over exactly where the game server is in relation to the players that are playing it. So dedicated game servers tend to run-- and Cyril, you can correct me on this-- usually around 30 hertz. Their tickrate is usually very important. And the speed at which you want to get data to your players is really important. We're talking tens of milliseconds type of stuff.

So having control over exactly where in the world a dedicated game server sits means that you have a lot of control over the experience your players have. So you can be like, you know, if I'm playing with Cyril, and he's in Montreal, I'm in San Francisco, we might want to place a server that's in between us so that we can both have a similar experience. If I'm playing with, say, someone who's in San Francisco, you want to put a server in San Francisco so we can have the best experience there. So that gives you a lot of control over exactly the experiences that your players have.
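[EDITOR'S SKETCH] Two bits of arithmetic sit behind this answer: how much time one tick gives you at a given tickrate, and how to pick a location that is fair to players in different cities. The sketch below is one simple way to express both. The region names and latency figures are invented for illustration, and "minimize the worst player's latency" is just one possible placement rule, not something the speakers prescribe.

```python
def tick_budget_ms(tick_rate_hz: float) -> float:
    """Time available to simulate one tick, in milliseconds."""
    return 1000.0 / tick_rate_hz


def pick_region(latency_ms: dict) -> str:
    """Pick the region whose slowest player has the lowest latency.

    latency_ms maps region -> {player: round-trip time in ms}.
    """
    return min(latency_ms, key=lambda region: max(latency_ms[region].values()))


# Made-up round-trip times for two players, purely for illustration.
measurements = {
    "montreal": {"cyril": 5, "mark": 80},
    "san-francisco": {"cyril": 80, "mark": 5},
    "chicago": {"cyril": 25, "mark": 50},
}

print(round(tick_budget_ms(30), 1))  # 33.3
print(pick_region(measurements))     # chicago
```

At 30 hertz the server has roughly 33 milliseconds per tick, which is why network latencies in the tens of milliseconds matter so much.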

CYRIL TOVENA: So I just want to add, also, that even if maybe we are sitting together, maybe the connection between together is not going to be perfect. So a game server, usually, when you connect to a cloud provider, you have a direct connection or a better connection than a connection to another guy who is just your neighbor maybe.

CRAIG BOX: Do game servers even that out? Does the game server say, you're closer than the other player, who's further away, so I'll hold your traffic back a bit so it's fairer between the two of you?

MARK MANDEL: There's a whole weeds, if you want to go down that thing, of lag compensation and doing a lot of stuff to kind of lie to your players so that it looks like a smooth experience to everyone. That is a whole other conversation of how you deal with UDP packet transmissions and how you want to do communication to your game servers. And everyone does it slightly differently because they have different requirements. There's a whole area of research in that area.

ADAM GLICK: One of the things this sounds somewhat familiar and analogous to is what people do with online retail. And people look for the lowest latencies possible. How do you connect to make sure that people have a good experience? How do you make sure that people on the client end are not cheating the system in order to have things come up? What's unique and different about the game server scenario that separates that from online transaction processing or other online servers?

MARK MANDEL: That's an interesting question. I think there are probably similarities and differences. At the game server level, you're dealing with information at a much higher rate. At a retail level, I am going to go place an order, and then I'm going to check out, right? There's probably minutes between things, potentially. At the game server level, if I want to cheat, I can do things like aimbots, right, which basically intercept the information or look at the information inside the game, usually with a hacked client, to try and see what extra information I can get to automatically position my camera and my mouse in the perfect way so I can kill another player straightaway, I can get a win.

And trying to pick that up in real time can be kind of tricky, because how do you know, is this a really good player, or is this someone trying to cheat the system? And there's some intricacy there, and there's a lot of stuff. I mean, that's just one of the things that people can do. There's a whole array of all sorts of stuff. And again, cheating detection and fraud detection is a whole area of research as well.

CYRIL TOVENA: Yeah, one similarity will be that your customer won't be happy.

MARK MANDEL: [LAUGH] Yes.

ADAM GLICK: So what makes Kubernetes a particularly good platform for building a game server?

MARK MANDEL: Kubernetes itself is not going to help you build the game server. What Kubernetes actually does is help us host and run this game server. So Kubernetes itself, Kubernetes is not going to help you build a web server. It's going to help you run the web server, and scale the web server, and keep it alive and healthy. So this is basically exactly the same things that Kubernetes provides for other types of workloads, such as web stuff, or more stateful stuff. It's providing us with the same facilities for game servers themselves.

Now, game servers are a little unique, I think-- well, unique-ish. There's some other stuff that has similar life cycles, but that once a game server starts and has players connected to it, you can't kill it. Because players tend to get mad if you interrupt their game. They tend to get really sad. And so we have some unique stuff for that and some tooling for that, and we handle things a little specially, and we can get into that.

But your question about what makes Kubernetes special about that is Kubernetes, the way I like to think of it is, it's a programmatic interface for running processes at scale. I think at its base core level, that's what it does. And there's a lot of cool bells and whistles and syntactic sugar around enabling you to do that. But that's really it at its base level. And so having those programmatic tools, and extension mechanisms, and all the standard stuff that Kubernetes gives you out of the box is a lot of the stuff that game servers still need. And then the ecosystem that exists around Kubernetes is still just as viable and just as useful. So it saved us so much work in terms of just being able to get this up and running.

And because we can extend Kubernetes with custom resource definitions, with controllers, we could just layer on top of the stuff that Kubernetes gives us. And then all of a sudden, it's like, oh, now we have a secure API endpoint. Now we have kubectl commands that work just out of the box with game servers. I'm sitting there the other day. I'm using GKE because I run on cloud, and I use that. I'm using the dashboard for that, and that all integrates into that. So all those tools are just great.

CYRIL TOVENA: The abstraction that it gives.

ADAM GLICK: Where did you come up with the name for Agones?

MARK MANDEL: It was the thing I could get through legal. [LAUGHING] Basically, I was looking for random Greek words, because Kubernetes and Agones. So I should say it correctly. I hope I get this right. I believe it's meant to be "a-hon-es," rather than Agones, because I think with a hard G it means elbows. Agones is like a pre-Olympic gathering of people where you would have a sport event. It's like a public sporting event. I believe in a dramatic setting, it's like a grouping of characters coming together to do something really special. So I thought there was some really nice overlap there.

CRAIG BOX: My experience with dedicated game servers is basically running "QuakeWorld" on LANs back in the '90s. But back then, there were different engines and servers, it felt, for every game that came out. Today, it feels like we're in a landscape where we've got a small number of console vendors and a small number of engines. And we've obviously got mobile now with consoles. Is it fair to say that most games today are built with a small handful of engines?

CYRIL TOVENA: It depends on the company. At Ubisoft, we like to use our own engine. Companies that are smaller, they usually use an engine that has been already proven to be working nicely and easy to use. And there are a lot of assets out there. So it depends on the company, but there are a couple of engines that exist.

CRAIG BOX: And if I'm using one of those engines, does that vendor provide the game server back end as well?

CYRIL TOVENA: That's a good question. I think some have frameworks in this case for building. But since the game server is really tied to the game logic, you have to build your own. There is no magic bullet here.

CRAIG BOX: OK, so everyone's actually building their own server back end based on what the front end logic of their game is.

CYRIL TOVENA: Technically, yeah.

MARK MANDEL: Yeah, so if you're using something like a commercial engine that's publicly available, something like Unity or Unreal, you can create a dedicated game server from that. Actually both of them have tooling that enables you to create a game server from code that is usually quite tightly coupled between the client and the server. And they usually have some tooling in there as well for doing synchronization of data between the two, as well as low-level tools if you actually want to write your own communication mechanisms as well, which a lot of very large companies do as well.

CRAIG BOX: Do those companies also address the scaling problem? Have you worked with them on Agones, for example?

MARK MANDEL: So usually they try and solve the problem for multiplayer synchronization for you and try and provide you tools for that. There are companies out there that do do proprietary game server hosting, though.

CRAIG BOX: Would you describe Agones as an operator?

MARK MANDEL: If you assume an operator is a combination of custom resource definitions and a controller, then yes. Do we assume that?

CRAIG BOX: Sure. What other things above and beyond that have you had to build?

MARK MANDEL: So we have an integrated SDK that is something that's part of it, that basically we-- it's actually gRPC based, mostly, that needs to sit within your game server that tracks things like, is my game server ready to accept connections? Is my game server healthy? There is some configuration stuff in there so it can grab configuration information and pass configuration information out, so do some communication that way. And long term, we also want to do some stuff around statistic management and whatnot. So there is some game-server-specific logic that comes with that and makes it a tighter integration experience with Agones and the operator, essentially, that comes with it. Cyril, can you think of anything else that we've had to add?

CYRIL TOVENA: I was thinking about the webhooks, mutating webhooks.
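[EDITOR'S SKETCH] The readiness and health tracking Mark describes can be pictured as a small state machine. The code below is a toy model under stated assumptions: `GameServerStatus`, its method names, the state strings, and the five-second timeout are all hypothetical and are not the Agones SDK or its API. It only shows the idea that a game server announces when it is ready, keeps sending health pings, and is marked unhealthy if the pings stop.

```python
class GameServerStatus:
    """Toy model of what a controller could track for one game server."""

    def __init__(self, health_timeout_s: float):
        self.health_timeout_s = health_timeout_s
        self.state = "Starting"
        self.last_ping_s = None

    def ready(self, now_s: float) -> None:
        # The game server reports it can accept player connections.
        self.state = "Ready"
        self.last_ping_s = now_s

    def health_ping(self, now_s: float) -> None:
        # The game server reports that it is still alive.
        self.last_ping_s = now_s

    def check(self, now_s: float) -> str:
        # Mark the server unhealthy if pings have stopped for too long.
        if (self.last_ping_s is not None
                and now_s - self.last_ping_s > self.health_timeout_s):
            self.state = "Unhealthy"
        return self.state


status = GameServerStatus(health_timeout_s=5.0)
status.ready(now_s=0.0)
status.health_ping(now_s=3.0)
print(status.check(now_s=6.0))  # Ready
print(status.check(now_s=9.0))  # Unhealthy
```

Timestamps are passed in explicitly rather than read from the clock so the behaviour is easy to follow and to test.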

CRAIG BOX: Mark, I remember we talked once about some custom auto scaling stuff you had to do.

MARK MANDEL: Yeah, so we are actually working on that now. In the last release, we talked about fleet auto scaling. So we have game servers, and we have Fleets. If you think Pod to Deployment, very similar. Fleets are just warm game servers that you can basically pull one out of as you need to. So we have some fleet auto scaling work. We've just got a basic strategy, and that will get expanded on.

Right now, I'm literally working on trying to work out the best way we can actually use the standard auto scaler so we don't have to do that work to respond to fleet auto scaling up and down. We have some interesting challenges there, but we're working on it.
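[EDITOR'S SKETCH] One way to picture a basic fleet auto scaling strategy is a buffer: keep a fixed number of warm servers on top of the ones already in use, and only ever remove servers that have no players. The sketch below assumes that buffer approach and invents the function names, state strings, and numbers; the conversation does not spell out what the Agones strategy actually is.

```python
def desired_fleet_size(allocated: int, buffer_size: int,
                       min_size: int, max_size: int) -> int:
    """Keep `buffer_size` warm servers on top of those already in use."""
    wanted = allocated + buffer_size
    # Never go below the number of servers that have players on them.
    floor = max(min_size, allocated)
    return max(floor, min(wanted, max_size))


def servers_to_remove(servers: dict, target: int) -> list:
    """Choose servers to delete when scaling down.

    servers maps name -> state ("Ready" or "Allocated"). Only warm,
    unallocated servers are candidates; live games are left alone.
    """
    excess = max(0, len(servers) - target)
    idle = sorted(name for name, state in servers.items() if state == "Ready")
    return idle[:excess]


fleet = {"gs-1": "Allocated", "gs-2": "Ready", "gs-3": "Ready", "gs-4": "Ready"}
target = desired_fleet_size(allocated=1, buffer_size=2, min_size=2, max_size=10)
print(target)                            # 3
print(servers_to_remove(fleet, target))  # ['gs-2']
```

Filtering on the "Ready" state is what encodes the earlier point that a game server with connected players must not be killed.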

ADAM GLICK: When I think about Kubernetes, I normally think about its ability to abstract away where you're running it and the compute that's kind of sitting underneath that to make anything that you're building in Kubernetes highly portable between environments, even between cloud and on-prem. Do you find that this is true when you're doing the game service, kind of in practice, as people are building these? Is that giving you the same flexibility?

MARK MANDEL: We are definitely aiming for vanilla Kubernetes. But since this is a Google Cloud project, we started with GKE first. We have found some small differences between cloud providers that have meant that we've either had to make small adjustments or put specific documentation in.

But generally speaking, it's all working pretty well. We do have some requirements for mutation webhooks and validation webhooks, which some cloud providers didn't have for a little while. But actually, I think, as of three days ago, now they do. So that's actually going to make things really easy.

But overall, it's actually been relatively smooth sailing. So that's been actually really good. And it's a definite aim of ours. We are specifically trying to make it so that people can run this on multiple cloud providers, on-prem. I mean, we've got it running on Minikube. That is the big thing for us. We want you to be able to run it on your local machine, if you have a QA cluster for your studio, and then across different cloud providers around the world. That's super important.

CRAIG BOX: You've mentioned the importance of keeping game servers close to the players. In the case where you have a fleet of clusters that are somewhat disconnected singletons and that you have clusters near various people, and they're all running the same game, but people aren't actually playing across them, what kind of multi-cluster support would you like to see come from the Kubernetes community to support a workload like that?

CYRIL TOVENA: I think right now this is still in design, and also very early. I looked at the tools out there. I didn't find yet something that will help us to be able to connect to multiple clusters. But yeah, we're looking for that. I think Mark is in touch with SIG Multicluster. Is that right?

MARK MANDEL: I've written down that I should talk to SIG Multicluster. [LAUGHING]

CYRIL TOVENA: Oh. OK. Yeah.

CRAIG BOX: What might you say to them if you did?

MARK MANDEL: I definitely need a registry of some kind for tracking different clusters, how to connect to them, their configuration information, their secrets, that kind of stuff. Registry health, making sure, do I know if this registry is up or down, or if I can take it up and down would be very useful. Having the registry be queryable so I can pull information out of it in a regular way would be awesome. I think those are probably the big things that I need. I don't know whether they exist or not. I haven't done the research.
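[EDITOR'S SKETCH] Mark's wish list (connection details, configuration, secrets, health, and a regular way to query) maps onto a fairly small data structure. The following is an in-memory sketch only. `ClusterEntry`, `ClusterRegistry`, the field names, and the example endpoints are all hypothetical and do not represent the Kubernetes cluster registry API or any other real project.

```python
from dataclasses import dataclass


@dataclass
class ClusterEntry:
    name: str
    region: str
    endpoint: str      # how to connect
    secret_ref: str    # a pointer to credentials, not the secret itself
    healthy: bool = True


class ClusterRegistry:
    def __init__(self):
        self._clusters = {}

    def register(self, entry: ClusterEntry) -> None:
        self._clusters[entry.name] = entry

    def set_health(self, name: str, healthy: bool) -> None:
        self._clusters[name].healthy = healthy

    def query(self, region=None, healthy_only=True) -> list:
        """Return matching cluster names in a stable, sorted order."""
        return sorted(
            c.name for c in self._clusters.values()
            if (region is None or c.region == region)
            and (c.healthy or not healthy_only)
        )


registry = ClusterRegistry()
registry.register(ClusterEntry("eu-1", "europe", "https://eu-1.example.invalid", "secrets/eu-1"))
registry.register(ClusterEntry("us-1", "us", "https://us-1.example.invalid", "secrets/us-1"))
registry.register(ClusterEntry("us-2", "us", "https://us-2.example.invalid", "secrets/us-2"))
registry.set_health("us-2", False)
print(registry.query(region="us"))  # ['us-1']
```

Storing a reference to the credentials rather than the credentials themselves is a design choice of this sketch, not a requirement stated in the conversation.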

CRAIG BOX: Yeah, the cluster Registry API should do most of what you've asked for.

MARK MANDEL: That would be lovely. I haven't looked at it though. I've got a note to look at it.

CYRIL TOVENA: I looked at it. And for now, it's just the API, right?

CRAIG BOX: And the binary, I think, to run it.

CYRIL TOVENA: Yeah, but it doesn't actually ensure that the credentials are correct.

CRAIG BOX: The world is much better with game servers. Anyone can just connect to the registry any time and give anyone 1,000 points as they see fit.

ADAM GLICK: I suspect that you're both familiar with Open Match. I was curious for you to help people understand the differences and the relationship between Agones and Open Match.

MARK MANDEL: Open Match is a matchmaking platform that runs on top of Kubernetes. If people aren't familiar with matchmaking, it's just basically a thing that takes players and puts them together in some kind of sensible way so that they have a good game experience. And Open Match is a platform for enabling you to do that. Think of it as a very specialized App Engine specifically for running matchmaking logic. That's kind of how we look at it.

At some point, there will be a connector between Open Match and Agones. Both projects are still relatively new. Agones has been running for almost a year, but we're still sort of alpha-ish, whereas Open Match has been running for several months. So there's that.

We will want to have a connector so that, basically, what will happen is people will get match made using Open Match. And once they're match made, then it can basically go over to Agones and be like, hey, I have a match here. Can I have a game server for them to play on? Pass that information back to the players so they can make a direct connection to that game server. But they are both pieces of work coming from Google Cloud Platform for basically doing open-source platforms for game services, which is a whole fun thing that we're trying to do.
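[EDITOR'S SKETCH] The handoff Mark describes (group players into a match, ask for a game server, pass the address back) can be sketched in a few lines. Since the connector did not exist at the time of recording, everything here is hypothetical: the function names, the first-come-first-served grouping rule, and the placeholder addresses are assumptions, not Open Match or Agones behaviour.

```python
def make_matches(queue: list, match_size: int) -> list:
    """Group waiting players in arrival order; leftovers keep waiting."""
    full = len(queue) - len(queue) % match_size
    return [queue[i:i + match_size] for i in range(0, full, match_size)]


def allocate(ready_servers: list, match: list) -> dict:
    """Hand one warm game server to a match and return connection info."""
    if not ready_servers:
        raise RuntimeError("no warm game servers available")
    address = ready_servers.pop(0)
    return {"players": match, "address": address}


# Placeholder addresses from the documentation-only 192.0.2.0/24 range.
warm = ["192.0.2.10:7000", "192.0.2.11:7000"]
waiting = ["ana", "bo", "cy", "di", "ed"]

assignments = [allocate(warm, m) for m in make_matches(waiting, match_size=2)]
for assignment in assignments:
    print(assignment["players"], "->", assignment["address"])
```

Real matchmaking would group players by skill, latency, or other criteria rather than arrival order; the grouping rule is deliberately trivial so the handoff stays visible.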

CRAIG BOX: A lot of our listeners will be playing their games in the evening and then working on enterprise software (TM) during the day. What lessons do you think that those people can adopt for running their own workloads on Kubernetes from Agones?

CYRIL TOVENA: Kubernetes is open source, so one lesson I took is ask everyone around. Go on the Kubernetes Slack channel and ask a question how people do. Look around on GitHub, and open question tickets on GitHub also if you're looking for help or how to do. So that's the beauty of it.

Since it's open source, it's very easy to find material on it. So that's what we did also, for Agones. I remembered I had to create the Helm chart for Agones. And I looked at some examples of what was nice, what wasn't, what we wanted. And that's how we built it.

MARK MANDEL: Yep.

CYRIL TOVENA: So look for examples.

MARK MANDEL: Absolutely. I think also that the other thing is don't be afraid of Custom Resource Definitions and controllers and operators. The documentation and the frameworks around this has got a lot better since I did it. And when I did it, basically, I think the best resource that I ever found was actually Joe Beda did a TGIK talking about writing a controller that was honestly invaluable. And just huge thanks to him for doing that because it made everything that we've done possible.

But there is now frameworks for operators. There's frameworks for powering custom resource definitions. Or you can use just the stuff that comes with client-go and look at those examples, which I think is totally fine. And once you get your head around how that works, it's actually not, I don't think, particularly difficult. But there's just a pattern that you kind of just pick up and run with. So don't be afraid of those.

If you're coming up to problems in Kubernetes and you're like, wow, it would be really great if I had an abstraction that could power this, or maybe if I built this thing, it would enable my team to do x, y, or z, don't be afraid of that. I think step into that. I think there's a really interesting and cool opportunity for having an ecosystem of operators and extensions to Kubernetes that could exist as a whole platform of things that people can do on top of Kubernetes. And I think that's something that's going to expand out. And I'm really excited about what can happen there.
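[EDITOR'S SKETCH] The pattern Mark refers to is, at its core, a reconcile loop: compare the state you declared with the state that exists and take whatever actions close the gap. The sketch below shows that idea with plain dictionaries standing in for a cluster. It is not client-go or any operator framework, and the resource names and specs are invented.

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Compare desired and actual state; return actions that close the gap.

    Both arguments map resource name -> spec. The function looks only at
    current state, so running it again after applying is harmless.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(actions: list, desired: dict, actual: dict) -> None:
    """Apply actions to the in-memory 'cluster' standing in for a real API."""
    for verb, name in actions:
        if verb == "delete":
            del actual[name]
        else:
            actual[name] = desired[name]


desired = {"gs-a": {"port": 7000}, "gs-b": {"port": 7001}}
actual = {"gs-b": {"port": 9999}, "gs-old": {"port": 7002}}

actions = reconcile(desired, actual)
print(actions)  # [('create', 'gs-a'), ('update', 'gs-b'), ('delete', 'gs-old')]
apply(actions, desired, actual)
print(reconcile(desired, actual))  # []
```

A custom resource definition supplies the "desired" side of this comparison, and a controller runs the loop whenever something changes.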

ADAM GLICK: You mentioned that you're still fairly early on in the project. How far away do you feel you are from a 1.0 release? And what direction would you like to see the product take as you reach maturity?

MARK MANDEL: I've been saying 1.0 in 2019, which I know is completely broad, and deliberately so. [LAUGHING]

CRAIG BOX: Calendar 2019.

MARK MANDEL: Calendar 2019. And I don't feel that's unreasonable. We have a set of tickets that are all available on our GitHub repository, which I think lays out a lot of our major features. We go through a design process before we do pretty much everything.

The big-ticket items for me personally-- and again, we need to talk about this as a community, and we're quite community driven-- we want to have a whole layer for statistic collection display, so basically the whole ops integration layer. Multi-cluster is there too. We're working through auto scaling. I'm looking for other fun stuff that would probably personally prevent a 1.0. I think outside of that, I think those are probably the big things.

We're chugging through those, which I'm pretty excited by. So I think 1.0 next year sounds pretty reasonable. Cyril, would you disagree with me? I'm making arbitrary decisions while we're on a podcast, by the way.

CYRIL TOVENA: No, I think the product is already great. There is, like you said, the auto scaling and multi-cluster stuff. Those are the biggest missing pieces, also dashboard and monitoring.

MARK MANDEL: Yeah.

CYRIL TOVENA: But yeah, also a Unity plugin. Maybe some folks out there would be interested.

MARK MANDEL: Yep.

CYRIL TOVENA: I know we have a design ticket for it.

MARK MANDEL: Yeah, and it's worth noting, people are developing on this already. It's been really great. I don't want to name names because I don't have permission to yet. But if you get involved with the community, it's pretty easy to work out who those people are.

But yeah, people are working on it. People are developing games on it right now. They're using it. And the feedback has been great, and the response has been great. And it's been hugely positive, actually. I've been super proud of the work you and I have done, Cyril, and everyone else who's contributed, by the way, as well.

ADAM GLICK: Cyril, Mark, thank you so much for joining us today.

CYRIL TOVENA: Thank you, guys.

MARK MANDEL: Thank you very much for having us. If you want to learn more, agones.dev.

ADAM GLICK: You can find Mark on Twitter as @neurotic and Cyril on Twitter as @kuqd. You can find Agones at https://agones.dev, or on Twitter at @AgonesDev.

[MUSIC PLAYING]

CRAIG BOX: That's the end of six months of the "Kubernetes Podcast." Thank you for listening. As always, if you've enjoyed the show, please continue to help us spread the word and tell a friend. If you didn't enjoy the show, tell two friends, but let them make their own opinion. If you have any feedback for us, you can find us on Twitter, at @KubernetesPod, or you can reach us by email at kubernetespodcast@google.com.

ADAM GLICK: You can also check out our website at kubernetespodcast.com, where you can find transcripts of each of the episodes. Until next time, take care.

CRAIG BOX: See you next week.

[MUSIC PLAYING]