Feb 3 2019

What would a EvE online Internet look like?

Translations are available in: русский

EvE online is a fascinating game. It’s one of the few MMO’s that only have a single “server” to log into, meaning everyone is playing on the same logical world. It’s also had a fascinating set of events that have happened inside it, and also remains a very visually appealing game for what it is:

It is also home to a expansive world map to hold all of these players. At its peak EvE had 63,000 players online in a single world with 500,000 paying subscriptions on top, and while that number is getting lower by the year the world remains infamously large. Meaning that to get from one side to another is a sizable amount of time (and risk due to player owned factions).

You travel to different areas using warping (within the same system) or jumping to different systems using a jump gate:

And all of these systems combine together to produce a map of beauty and complexity:

I always viewed this map like a network, it’s a large mesh of systems that connect to each other so that people can get across, with most systems having more than two jump gates. this lead me to think what would happen if you took the idea of the map being a network literally? What would a EvE online internet of systems look like?

For this we need to understand how the real Internet works. The internet is a large collection of ISP’s that are all numerically identified with a standardised and unique ISP number, called an Autonomous System Number or ASN ( or AS for shorter ). These AS’es need a way to exchange routes with each other, since they will own ranges of IP addresses, and need a way to tell other ISPs that their routers can route these IP addresses. For this, the world has settled on Border Gateway Protocol or BGP.

BGP works by “announcing” to another AS (known as a peer) their routes:

The default behavior of BGP is when it gets a route from a peer, is to give it to all of the other peers it is connected to as well. This means that peers will share their view of the routing table with each other automatically.

However this default behavior is only useful if you are using BGP to send routes in to internal routers, since the modern internet has different logical relationships with each other. Instead of a mesh, the modern internet looks more like this:

However EvE online is set in the future. Who knows if the internet still relies on this for-profit routing layout. Let’s pretend that it doesn’t so we can see how BGP scales in larger networks.

For this we will need to simulate real BGP router behavior and connections. Giving that EvE has a relatively low 8000~ systems, and a reasonable 13.8k links between them. I figured it was not impossible to actually run 8000~ virtual machines with real BGP and networking on them to find out how these real systems looked like when acting together as a network.

However, we do not have unlimited resources, so we will have to find a way to make the smallest Linux image possible in both disk and memory consumption. For this I looked towards embedded systems, since embedded systems often have to run in very low resource environments. I stumbled upon Buildroot and within a few hours had a small Linux image with just the things I needed to make this project work.

$ ls -alh total 17M drwxrwxr-x 2 ben ben 4.0K Jan 22 22:46 . drwxrwxr-x 6 ben ben 4.0K Jan 22 22:45 .. -rw-r--r-- 1 ben ben 7.0M Jan 22 22:46 bzImage -rw-r--r-- 1 ben ben 10M Jan 22 22:46 rootfs.ext2

This image contains a bootable linux setup that also has: * Bird 2 BGP Daemon * tcpdump and My Traceroute (mtr) for network debugging * busybox for a basic shell and system utilities

This image can easily be run in qemu with a reasonably little amount of options:

qemu-system-i386 -kernel ../bzImage \ -hda rootfs.ext2 \ -hdb fat:./30045343/ \ -append "root=/dev/sda rw" \ -m 64

For networking I chose to use an undocumented feature of qemu (in my version), in where you can point two qemu processes at each other and use UDP sockets to transfer data between them. This is handy as we are planning to provision a lot of links, so using the normal method of TUN/TAP adapters would get messy fast.

Since this feature is somewhat undocumented, there were some frustrations getting it working. It took me a significant amount of time to discover that the net name on the command line needs to be the same for both sides of the connection. At a later glance it appears that this option is now better documented however as with all things. These changes take awhile to propagate to downstream distributions.

Once that was working, we have a pair of VM’s that can send packets between them with the hypervisor carrying them as UDP datagrams. Since we will be running a lot of these systems, we need a fast way to configure them all using pre-generated configuration. For this we can use a handy feature of qemu that allows you to take a directory on the hypervisor and turn it into a virtual FAT32 filesystem. This is useful since it allows us to make a directory per system we plan to run, and have each qemu process point to that directory, meaning we can use the same boot image for all VM’s in the cluster.

Since each system has 64MB of RAM, and we are planning to run 8000~ VM’s, we will clearly need a decent amount of RAM. For this I used 3 of packet.net’s m2.xlarge.x86 instances, since they offer 256GB of RAM with 2x Xeon Gold 5120’s meaning they have a decent amount of grunt to them.

I used another open source project to produce the EvE map in JSON form, and then have a program of my own produce configuration from that data. After having a few test runs of just a few systems I proved that they could take configuration from the VFAT and establish BGP sessions with each other over that.

So I then took the plunge at booting the universe:

At first I attempted to launch all systems in one big event, unfortunately this ended up causing a big bang in regards to the load of the system, so after that I switched to launching a system every 2.5 seconds, and the 48 cores on the system ended up taking care of that nicely.

During the booting process I found you would see large “explosions” of CPU usage over all of the VM’s, I later figured out this was large sections of the universe getting connected to each other, thus causing large amounts of BGP traffic to both sides of the newly connected VM’s.

root@evehyper1:~/147.75.81.189# ifstat -i bond0,lo bond0 lo KB/s in KB/s out KB/s in KB/s out 690.46 157.37 11568.95 11568.95 352.62 392.74 20413.64 20413.64 468.95 424.58 21983.50 21983.50

In the end we got to see some pretty awesome BGP paths, since every system is announcing a /48 of IPv6 addresses, you could see the routes to every system, and all of the other systems it would have to route though to get there.

$ birdc6 s ro all 2a07:1500:b4f::/48 unicast [session0 18:13:15.471] * (100) [AS2895i] via 2a07:1500:d45::2215 on eth0 Type: BGP univ BGP.origin: IGP BGP.as_path: 3397 3396 3394 3385 3386 3387 2049 2051 2721 2720 2719 2692 2645 2644 2643 145 144 146 2755 1381 1385 1386 1446 1448 849 847 862 867 863 854 861 859 1262 1263 1264 1266 1267 2890 2892 2895 BGP.next_hop: 2a07:1500:d45::2215 fe80::5054:ff:fe6e:5068 BGP.local_pref: 100

I took a snapshot of the routing table on each router in the universe and then graphed out the highly used systems to get to other systems, however this image is huge. So here is a small version of it in this post, if you click on the image, be aware this image will likely make your device run out of memory

After this I got thinking, what else could you map out in BGP routed networks? Could you use a smaller model of this to test out how routing config worked in large scale networks? I produced a file that mapped out the London underground system to test this out:

The TFL system is much smaller, and has a lot more hops that only have one direction to go in since most stations only have a single “line” of transport, however we can learn one thing out of this, we can use this to harmlessly play with BGP MED’s.

There is however a problem when we view the TFL map as a BGP network, in the real world the time/latency between every stop is not the same, and so if we were to simulate this latency we would not be moving our way around the system as fast as we could be, since we are only looking at the fewest stations to pass through.

However thanks to a Freedom Of Information Act request (FOIA) that was sent to TFL, we have the time it takes to get from one station to another. These got generated into the BGP configuration of the routers, for example:

protocol bgp session1 { neighbor 2a07:1500:34::62 as 1337; source address 2a07:1500:34::63; local as 1337; enable extended messages; enable route refresh; ipv6 { import filter{ bgp_med = bgp_med + 162; accept; }; export all; }; } protocol bgp session2 { neighbor 2a07:1500:1a::b3 as 1337; source address 2a07:1500:1a::b2; local as 1337; enable extended messages; enable route refresh; ipv6 { import filter{ bgp_med = bgp_med + 486; accept; }; export all; }; }

In the session1 the time distance between the two stations is 1.6 mins, the other path out of that station is 4.86 mins away. This number is added on to the route for each station/router it passes through. Meaning every router/station on the network knows it’s time to get to every station through every route:

This means that traceroutes are accurate looks into how you can move around London, for example my local station to Paddington:

We can also have some fun with BGP by simulating some maintenance or a incident at Waterloo station:

And how the entire network instantly picks the next fastest route, rather than the one with the fewest stations to pass through.

And that’s the magic of BGP MED’s in routing!

The code for all of this is open now, You can build your own network structures with a reasonably simple JSON schema, or just use EvE online or TFL since that is already in the repo.

You can find all of the code for this here: https://github.com/benjojo/eve-online-bgp-backbone

If this is the sort of madness you enjoy, you will enjoy the last half year where I’ve been doing various other things like playing battleships with BGP or Testing if RPKI works. To keep up to date with my posting you should use the blog’s RSS feed, or follow me on twitter.

I would also like to thank Basil Fillan (Website/Twitter) for helping out with the BIRD2 iBGP configuration in this post.

It is worth mentioning that I am looking for a EU based job sometime in April 2019, so if you are (or know) a company doing interesting things that align with some of the sensible or non sensible things I do. Please let me know!