This article is an update of my previous benchmark, now running on Kubernetes 1.14 with CNI versions up to date as of April 2019.

First of all, many thanks to the Cilium team, who helped me by reviewing and correcting my metrics monitoring scripts.

What’s new since November 2018

If you just want to know what has changed since last time, here is a quick summary:

Flannel is still one of the fastest and leanest CNIs in the benchmark, but it still supports neither NetworkPolicies nor encryption.

Romana is no longer maintained, so we removed it from the benchmark.

WeaveNet now supports both Ingress and Egress NetworkPolicies! But performance is lower than before.

Calico still requires you to set the MTU manually if you want the best performance. Calico provides two new options for installing their CNI, removing the need for a dedicated ETCD store:

storing state in Kubernetes API as datastore (cluster < 50 nodes)

storing state in Kubernetes API as datastore with Typha proxy to reduce the pressure on the K8S API (cluster > 50 nodes)

Calico announced support of Application Layer Policy on top of Istio, bringing security to the application layer.

Cilium now supports encryption! Cilium provides encryption with IPSec tunnels and offers an alternative to WeaveNet for encrypted networking. However, WeaveNet is faster than Cilium with encryption enabled. This is because Cilium 1.4.2 only supports CBC encryption. GCM would be better, as it can be hardware-offloaded by network adapters, but it will only be part of Cilium 1.5.

Cilium is now much easier to deploy thanks to the ETCD operator they embed.

The Cilium team also worked on reducing the CNI footprint by lowering memory consumption and CPU cost, but Cilium is still heavier than the other contestants.

Benchmark context

The benchmark is conducted on three Supermicro bare-metal servers connected through a Supermicro 10Gbit switch. The servers are directly connected to the switch via DAC SFP+ passive cables and are set up in the same VLAN with jumbo frames activated (MTU 9000).

Kubernetes 1.14.0 is set up on Ubuntu 18.04 LTS, running Docker 18.09.2 (the default Docker version on this release).

To improve reproducibility, we have chosen to always set up the master on the first node, to host the server part of the benchmark on the second server, and the client part on the third one. This is achieved via NodeSelector in Kubernetes deployments.
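As an illustration of this pinning, here is a minimal sketch that builds Deployment manifests with a nodeSelector. The node names (node2, node3), the deployment names, and the image names are hypothetical placeholders, not the ones used in the benchmark. `kubernetes.io/hostname` is a standard node label.

```python
import json


def pinned_deployment(name, image, hostname):
    """Build a Deployment manifest whose single pod is pinned to one node."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Schedule the pod only on the node carrying this label.
                    "nodeSelector": {"kubernetes.io/hostname": hostname},
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


# Hypothetical names: server part on the second node, client on the third.
server = pinned_deployment("bench-server", "example/bench-server", "node2")
client = pinned_deployment("bench-client", "example/bench-client", "node3")
server_json = json.dumps(server, indent=2)
```

The resulting JSON can be saved to a file and applied with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.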

Here is the scale we will be using to describe benchmark results and interpretation:

Scale for results’ interpretation

Selection of CNIs for the benchmark

This benchmark only covers the CNIs listed in the “create a single master cluster with kubeadm” part of the official Kubernetes documentation. Of the 9 CNIs mentioned there, we only test 6, excluding those that cannot be installed easily and/or do not work out of the box when following the documentation (Romana, Contiv-VPP, and JuniperContrail/TungstenFabric).

Here is the list of CNIs we will compare:

Calico v3.6

Canal v3.6 (which is, in fact, Flannel for network + Calico for firewalling)

Cilium 1.4.2

Flannel 0.11.0

Kube-router 0.2.5

WeaveNet 2.5.1

Installation

The easier a CNI is to set up, the better our first impression will be. All benchmarked CNIs were very easy to set up (one or two command lines).

As mentioned earlier, both the servers and the switch are configured with jumbo frames (MTU 9000). We would really appreciate a CNI that auto-discovers the MTU to use, depending on the adapters. In fact, Cilium and Flannel are the only ones that correctly auto-detect the MTU. Most other CNIs have open GitHub issues requesting MTU auto-detection, but for now we need to set it manually by modifying a ConfigMap for Calico, Canal, and Kube-router, or via an ENV var for WeaveNet.

You may be wondering what the impact of an incorrect MTU is. Here is a chart showing the difference between WeaveNet with the default MTU and WeaveNet with jumbo frames:

Impact of MTU setting on bandwidth performances

So, now that we know the MTU is very important for performance, how well do these CNIs auto-detect it?

MTU auto-detected by CNIs

As we see in the above graph, we have to apply some MTU tuning to Calico, Canal, Kube-router, and WeaveNet to get the best performance. Cilium and Flannel are able to correctly auto-detect MTU on their own, ensuring out-of-the-box best performance.
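To give a rough idea of what "MTU tuning" involves, here is a small sketch that derives a pod-side MTU from the node MTU and counts the packets needed for a transfer. The overhead values are the standard IPv4 header sizes for IP-in-IP and VXLAN. Which encapsulation each CNI actually uses, and therefore which value to put in its ConfigMap or ENV var, is an assumption to check against that CNI's documentation.

```python
# Per-packet encapsulation overhead in bytes, assuming an IPv4 underlay.
ENCAP_OVERHEAD = {
    "none": 0,
    "ipip": 20,   # one extra IPv4 header
    "vxlan": 50,  # inner Ethernet 14 + outer IPv4 20 + UDP 8 + VXLAN 8
}

# IPv4 header (20) + TCP header (20), without options.
TCP_IPV4_HEADERS = 40


def pod_mtu(node_mtu, encapsulation):
    """Largest MTU a pod interface can use without fragmenting on the node link."""
    return node_mtu - ENCAP_OVERHEAD[encapsulation]


def packets_needed(payload_bytes, mtu):
    """Number of full-size TCP packets needed to carry payload_bytes."""
    payload_per_packet = mtu - TCP_IPV4_HEADERS
    # Integer ceiling division.
    return -(-payload_bytes // payload_per_packet)


jumbo_vxlan_mtu = pod_mtu(9000, "vxlan")
jumbo_ipip_mtu = pod_mtu(9000, "ipip")
packets_default = packets_needed(10**9, 1500)
packets_jumbo = packets_needed(10**9, 9000)
```

With a 1500-byte MTU, the same transfer needs roughly six times as many packets as with jumbo frames, and each packet carries a fixed processing cost. This is one plausible reason why a CNI left at its default MTU falls behind on a jumbo-frame network.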

Security

When comparing the security of these CNIs, we are talking about two things: their ability to encrypt communications, and their implementation of Kubernetes Network Policies (according to real tests, not from their documentation).

There are only two CNIs that can encrypt communications: Cilium and WeaveNet. WeaveNet encryption is enabled by setting an encryption password as an ENV variable of the CNI. WeaveNet documentation is a bit confusing, but this is quite easy to do. Cilium encryption is set with commands that create Kubernetes Secrets and through daemonSet modification (a bit more complex than WeaveNet, but Cilium has documented it very well).
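The WeaveNet side of this can be sketched as follows: generate a random password, store it in a Kubernetes Secret, and expose it to the WeaveNet container as an environment variable. `WEAVE_PASSWORD` is the variable WeaveNet reads. The Secret name and key used here are illustrative assumptions, and this is not necessarily how the benchmark itself was configured.

```python
import secrets


def weave_password_secret(name="weave-passwd", namespace="kube-system"):
    """Build a Secret manifest holding a freshly generated password."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        # stringData lets the API server handle the base64 encoding.
        "stringData": {"weave-passwd": secrets.token_urlsafe(32)},
    }


def weave_password_env(secret_name="weave-passwd", key="weave-passwd"):
    """Env entry to add to the WeaveNet container spec in its DaemonSet."""
    return {
        "name": "WEAVE_PASSWORD",
        "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
    }


secret = weave_password_secret()
env_entry = weave_password_env()
```

Keeping the password in a Secret rather than as a literal value in the DaemonSet avoids exposing it to anyone who can read the DaemonSet definition.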

When it comes to Network Policy implementation, Calico, Canal, Cilium, and WeaveNet are the best of the panel, implementing both Ingress and Egress rules. Kube-router implements only Ingress rules.

Flannel does not implement Network Policies.
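For reference, a policy that exercises both directions could look like the sketch below. It is an illustrative example rather than the policy used in the tests, and the app labels and port are hypothetical. A CNI that only implements Ingress would enforce the `ingress` section and ignore the `egress` one.

```python
def allow_only_peer(app, peer_app, port):
    """NetworkPolicy restricting `app` to talk to and from `peer_app` only."""
    peer = {"podSelector": {"matchLabels": {"app": peer_app}}}
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"{app}-allow-{peer_app}"},
        "spec": {
            # Pods this policy applies to.
            "podSelector": {"matchLabels": {"app": app}},
            # Listing both types makes the policy restrict both directions.
            "policyTypes": ["Ingress", "Egress"],
            "ingress": [
                {"from": [peer], "ports": [{"protocol": "TCP", "port": port}]}
            ],
            "egress": [{"to": [peer]}],
        },
    }


policy = allow_only_peer("bench-server", "bench-client", 80)
```

One simple way to check real behavior is to apply such a policy and then try a connection that it should block, in each direction.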

Here is the summary of the results:

Summary of security benchmark result

Performance

This benchmark shows the average bandwidth of three runs (at least) of each test. We are testing TCP and UDP performance (using iperf3), real applications like HTTP (using Nginx and curl), or FTP (using vsftpd and curl), and finally the behavior of application encryption with SCP protocol (using OpenSSH server and client).
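As a sketch of how such an average could be computed for the TCP test, the snippet below reads the receiver-side throughput from parsed `iperf3 --json` results and compares it to a bare-metal baseline. The sample figures are hypothetical, and the author's actual scripts may work differently. UDP runs report their totals under a different key, so this covers TCP only.

```python
from statistics import mean


def received_bps(result):
    """Receiver-side throughput of one parsed `iperf3 --json` TCP run."""
    return result["end"]["sum_received"]["bits_per_second"]


def average_mbps(results):
    """Average throughput over several runs, in Mbit/s."""
    return mean(received_bps(r) for r in results) / 1e6


def gap_percent(cni_mbps, bare_metal_mbps):
    """How far below bare metal a CNI is, as a percentage of bare metal."""
    return (bare_metal_mbps - cni_mbps) / bare_metal_mbps * 100


# Hypothetical results for three runs, already loaded with json.load().
runs = [
    {"end": {"sum_received": {"bits_per_second": 9.0e9}}},
    {"end": {"sum_received": {"bits_per_second": 9.2e9}}},
    {"end": {"sum_received": {"bits_per_second": 9.4e9}}},
]
cni_average = average_mbps(runs)
```

Averaging several runs smooths out run-to-run noise, and expressing the result as a gap to bare metal makes CNIs comparable across tests with different absolute speeds.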

For all tests, we also run the benchmark on the bare-metal nodes (green bar) to compare the effectiveness of the CNI vs native network performance. To maintain consistency with our benchmark scale, we use the following colors on the charts:

Yellow = Very good

Orange = Good

Blue = Fair

Red = Poor

Because we do not focus here on the performance of misconfigured CNIs, we will only show benchmark results for MTU-tuned CNIs. (NOTA BENE: Cilium does not calculate the MTU correctly if you activate encryption, so you must manually reduce the MTU to 8900 in v1.4. The next version, 1.5, will adapt automatically.)

Here are the results:

TCP performance

Every CNI is performing well with the TCP benchmark. CNIs with encryption enabled are far behind others, due to the cost of encryption.

UDP Performance

Again, in the UDP benchmark, all CNIs are performing well. Encrypted CNIs are now very close to each other. Cilium is just a bit behind its competitors, but in fact it is only 2.3% behind bare-metal results, which is fair enough. What we should keep in mind is that Cilium and Flannel are the only CNIs that correctly auto-detect the MTU, thus providing these results out of the box.

What about a real-world application? With the HTTP benchmark, we can see that overall performance is a bit lower than with the TCP test. Although HTTP runs over TCP, iperf3 was configured in the TCP benchmark to avoid any “TCP slow start” side effect, and slow start can affect the HTTP benchmark. Everyone is pretty good here: Kube-router has a clear advantage, while WeaveNet performs quite badly on this test, about 20% below bare metal. Both Cilium encrypted and WeaveNet encrypted are now far away from bare-metal performance.

With FTP, another TCP-backed protocol, results are more mixed. While Flannel and Kube-router are performing very well, Calico, Canal, and Cilium are just a bit behind, at about 10% under bare-metal speed. WeaveNet is really far from bare-metal performance, with a 17% gap. Still, the encrypted version of WeaveNet performs about 40% better than Cilium encrypted.

With SCP, we can clearly see the encryption cost of SSH protocol. The majority of CNIs are performing well, but WeaveNet is, once again, a bit behind others. Cilium encrypted and WeaveNet encrypted are, of course, behind because of the double encryption cost (SSH encryption + CNI encryption).

Here is a summary of performance results:

CNI performance summary

Resource consumption

Let’s now compare how these CNIs handle resource consumption under heavy load (during a TCP 10Gbit transfer). In performance tests, we compare CNIs to bare metal (green bar). For resource consumption tests, we also show the consumption of a fresh idle Kubernetes (purple bar) without any CNI set up. We can then figure out how much overhead a CNI really adds.

Let’s begin with the memory aspect. Here is the average node RAM usage (without buffers/cache) in MB during transfer.

Memory consumption

Flannel and Kube-router are both performing very well, with only about 50MB memory footprint, followed by Calico and Canal with 70MB. WeaveNet consumption is clearly above its competitors with about 130MB footprint. With a 400MB memory footprint, Cilium has the highest memory consumption of the benchmark.
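For readers who want to reproduce a "RAM usage without buffers/cache" figure, here is a minimal sketch that derives it from the contents of /proc/meminfo. It is a simplification and not necessarily the formula used by the benchmark scripts. Tools such as `free` may account for additional fields.

```python
def used_mb_without_cache(meminfo_text):
    """Used memory in MB, excluding buffers and page cache.

    meminfo_text is the content of /proc/meminfo, whose values are in kB.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            fields[key.strip()] = int(rest.split()[0])
    used_kb = (
        fields["MemTotal"]
        - fields["MemFree"]
        - fields["Buffers"]
        - fields["Cached"]
    )
    return used_kb / 1024


# Hypothetical sample, in the same layout as /proc/meminfo.
sample = """MemTotal:        8192000 kB
MemFree:         4096000 kB
Buffers:          102400 kB
Cached:           921600 kB
"""
sample_used_mb = used_mb_without_cache(sample)
```

On a Linux node, the function can be fed with the text read from /proc/meminfo, sampled at regular intervals during the transfer and then averaged.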

Now, let’s check the CPU consumption. Warning: the graph unit is not percent but permil (per thousand). So 38 permil for bare metal is actually 3.8%. Here are the results:

CPU consumption

Calico, Canal, Flannel, and Kube-router are all very CPU efficient, with just 2% overhead compared to Kubernetes without CNI. Far behind is WeaveNet with about 5% overhead, and then Cilium with more than 7% CPU overhead.
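The unit conversion and the overhead comparison can be written down as a short sketch. Only the 38 permil bare-metal value comes from the text. The other figure is a hypothetical reading used for illustration.

```python
def permil_to_percent(value_permil):
    """Convert a per-thousand reading into a percentage."""
    return value_permil / 10


def cpu_overhead_percent(cni_permil, baseline_permil):
    """CPU overhead of a CNI over a baseline, in percentage points."""
    return permil_to_percent(cni_permil - baseline_permil)


bare_metal_percent = permil_to_percent(38)
# Hypothetical: a CNI measured at 58 permil against a 38 permil baseline.
example_overhead = cpu_overhead_percent(58, 38)
```

The overhead is expressed in percentage points of total CPU, so a 2% overhead means the CNI uses two more points of CPU than the baseline, not 2% more than the baseline's own usage.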

Here is a summary of resource consumption:

Summary

Here is an aggregated overview of all results:

Benchmark results overview

Conclusion

This final part is subjective and conveys my own interpretation of the results. Keep in mind that this benchmark only tests single-connection throughput on a very small cluster (3 nodes). It reflects neither the networking behavior of large clusters (>50 nodes) nor behavior under many concurrent connections.

I suggest using the following CNIs if you are in one of these scenarios:

You have low-resource nodes in your cluster (only a few GB of RAM, a few cores) and you don’t need security features: go with Flannel. It is one of the leanest CNIs we tested. Moreover, it is compatible with a very large number of architectures (amd64, arm, arm64, etc.). It is the only one, along with Cilium, that is able to correctly auto-detect your MTU, so you don’t have to configure anything to make it work. Kube-router is also good, but it is less standard and requires you to set the MTU manually.

You need to encrypt your network for security reasons: go with WeaveNet. Don’t forget to set your MTU size if you are using jumbo frames, and activate encryption by giving a password in an environment variable. But then again, forget about performance; this is the price of encryption.

For every other common usage, I would recommend Calico. This CNI is widely used in many Kubernetes deployment tools (Kops, Kubespray, Rancher, etc.). Just like with WeaveNet, don’t forget to set the MTU in the ConfigMap if you are using jumbo frames. It has proven to be multipurpose and efficient in terms of resource consumption, performance, and security.

Last but not least, I would recommend that you follow Cilium’s work. Their team is very active, they are working hard to improve their CNI (features, resource savings, performance, security, multi-cluster spanning, …), and their roadmap sounds very interesting.

Zoom on CNI selection

EDIT: Summary views have been updated to show raw CPU and RAM values rather than calculated ones, as the calculation was not clearly explained.