Kubernetes & IPVS

In this article, we explain the IPVS feature now available on Kubernetes (1.9 and later).

Intended audience: sysadmins learning k8s or working with Kubernetes. Basic knowledge of Kubernetes architecture and workflows is recommended to fully understand the benefits of IPVS.

By Flavien Hardy, Cloud Consultant @ Objectif Libre

What is IPVS?

IPVS (IP Virtual Server) is a Linux kernel feature providing layer 4 load balancing, also called layer 4 switching. A stable version of IPVS has been available since Linux 2.6.

In a nutshell, IPVS exposes a service through a single entry point, a unique virtual IP. All TCP/UDP traffic going through this entry point is load-balanced between the real servers behind it.

How does this work with Kubernetes?

The definition of IPVS given above also describes a ClusterIP service in Kubernetes: a single virtual IP whose traffic is distributed across a set of pods.

                                                __
 --------------------       --------------     /   POD1
[ Incoming traffic ] ---> [ Service IP ] ->---    POD2
 --------------------       --------------     \__ POD3

In previous versions of Kubernetes, services (managed by kube-proxy) were implemented with IPTables rules. The IPVS feature is intended to replace this mechanism: instead of adding new iptables rules, kube-proxy can now use IPVS to implement services.

IPVS vs IPTables?

As stated before, the default service implementation in Kubernetes uses IPTables. Large deployments (5000+ services) reach the limits of IPTables:

Slow packet processing

Slow insertion of new rules

These performance issues are due to IPTables and Netfilter implementation: rules are evaluated sequentially for each incoming packet. The more rules there are, the longer the processing takes. The IPVS implementation differs from IPTables: it uses a hash table managed by the kernel to establish the destination of a packet.
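The difference between the two lookup strategies can be sketched in a few lines of Python. This is only a toy model: the real matching happens in the kernel, and the addresses, pod names, and function names below are made up for illustration. It shows that a sequential rule list needs as many comparisons as there are rules to reach the last service, while a hash table finds the same entry with a single lookup.

```python
# Toy model only: real iptables and IPVS matching happens in the kernel.

def match_sequential(rules, dest):
    """Walk an ordered rule list (like an iptables chain) and count comparisons."""
    comparisons = 0
    for virtual_addr, backend in rules:
        comparisons += 1
        if virtual_addr == dest:
            return backend, comparisons
    return None, comparisons


def match_hashed(table, dest):
    """Find the destination with one hash table lookup (the IPVS approach, conceptually)."""
    return table.get(dest)


# Hypothetical cluster: 5000 virtual addresses, each mapped to one backend.
rules = [(f"10.233.{i // 250}.{i % 250 + 1}:80", f"pod-{i}") for i in range(5000)]
table = dict(rules)

# Worst case for the sequential walk: the service declared last.
last_service = rules[-1][0]
backend, comparisons = match_sequential(rules, last_service)
print(backend, comparisons)               # pod-4999 5000
print(match_hashed(table, last_service))  # pod-4999
```

The cost of the sequential walk grows with the number of services, whereas the cost of the hash lookup stays roughly constant.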

Firewall management is the primary use case for IPTables; when it has to process packets against a very large rule set, its performance collapses.

The following measurements (carried out and provided by Haibin Xie) show the performance differences between IPVS and IPTables:

Metrics              Number of services   IPVS       IPTables
Service access time  1,000                10 ms      7-18 ms
                     10,000               9 ms       80-7000 ms
                     50,000               9 ms       Non-functional
Memory usage         1,000                386 MB     1.1 GB
                     10,000               542 MB     2.3 GB
                     50,000               1272 MB    OOM
CPU usage            1,000                0%         N/A
                     10,000               0%         50%-100%
                     50,000               0%         N/A

How to set up IPVS?

IPVS is provided as a beta feature in current Kubernetes 1.9. To use it you must enable the SupportIPVSProxyMode feature gate.

If you deploy your cluster with Kubespray, add the following parameter in the k8s-cluster.yml configuration file:

kube_proxy_mode: ipvs

Impact (kube-proxy only):

Enables the SupportIPVSProxyMode feature gate


Loads additional kernel modules (ip_vs_rr, ip_vs_wrr, ip_vs_sh, nf_conntrack_ipv4)
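To check that the modules listed above are actually loaded on a node, one option is to compare them against the contents of /proc/modules. The sketch below assumes the usual /proc/modules layout (module name in the first column); the function name and the sample text are hypothetical. Modules compiled directly into the kernel do not appear in that file, so a module reported as missing is not necessarily unavailable.

```python
# Kernel modules named in this article for kube-proxy in IPVS mode.
REQUIRED_MODULES = ("ip_vs_rr", "ip_vs_wrr", "ip_vs_sh", "nf_conntrack_ipv4")


def missing_modules(proc_modules_text, required=REQUIRED_MODULES):
    """Return the required modules that are absent from /proc/modules content."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in loaded]


# Hypothetical sample in the /proc/modules format (name is the first column).
sample = """\
ip_vs_rr 16384 1 - Live 0x0000000000000000
ip_vs 147456 3 ip_vs_rr, Live 0x0000000000000000
nf_conntrack_ipv4 16384 2 - Live 0x0000000000000000
"""

print(missing_modules(sample))  # ['ip_vs_wrr', 'ip_vs_sh']
```

On a real node, the text would come from reading /proc/modules instead of the sample string.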

The IPVS feature will be declared as stable in Kubernetes 1.10 (https://github.com/kubernetes/kubernetes/pull/58442).

Additional benefits

Load balancing

IPVS for kube-proxy lets the administrator choose among the most common load balancing methods: round robin (default), least connection, destination hashing, and others.
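To make the difference between the first two methods concrete, here is a minimal Python sketch. It is a simplified illustration, not the kernel implementation: the class names and pod names are hypothetical, and the real least-connection scheduler also takes inactive connections into account.

```python
import itertools


class RoundRobin:
    """Hand out backends in a fixed rotation, ignoring their current load."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)


class LeastConnection:
    """Hand out the backend that currently has the fewest active connections."""

    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def pick(self):
        # Ties go to the backend declared first.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        self.active[backend] -= 1


pods = ["pod1", "pod2", "pod3"]

rr = RoundRobin(pods)
print([rr.pick() for _ in range(4)])  # ['pod1', 'pod2', 'pod3', 'pod1']

lc = LeastConnection(pods)
first, second = lc.pick(), lc.pick()  # pod1, then pod2
lc.release(first)                     # the connection to pod1 closes
print(lc.pick())                      # pod1 again: it has no active connection
```

Round robin suits short, uniform requests; least connection tends to behave better when some connections are long-lived.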

Currently, the load balancing method cannot be changed for a specific service (related GitHub issue).

As of today, if you want to change the default load balancing method in Kubespray, you must update the kube-proxy manifest template.

Administration

IPVS comes with a handy CLI: ipvsadm. It is much more efficient than shell tricks like iptables -L -t nat | grep PATTERN.

Example:

~ # ipvsadm -l -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.233.0.1:443 rr persistent 10800
  -> 195.154.162.187:6443         Masq    1      0          0
  -> 195.154.165.191:6443         Masq    1      0          0
  -> 62.210.115.35:6443           Masq    1      2          0
[...]

This shows the IPVS rules for the in-cluster API server service default/kubernetes:443. The virtual IP is 10.233.0.1:443. Incoming TCP packets are load-balanced between the three Kubernetes master nodes (IP:6443) using round-robin.
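Because the ipvsadm output is line-oriented, it is also easy to post-process in a script. The following sketch parses the listing shown above into a list of services with their backends. The function name and dictionary keys are hypothetical, and the parser only handles the layout shown in this article (TCP/UDP service lines followed by "->" backend lines).

```python
def parse_ipvsadm(output):
    """Parse `ipvsadm -l -n` text into a list of services and their backends."""
    services = []
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("TCP", "UDP"):
            # Service line: protocol, virtual address, scheduler, optional flags.
            services.append({
                "protocol": fields[0],
                "virtual": fields[1],
                "scheduler": fields[2],
                "backends": [],
            })
        elif fields[0] == "->" and services and fields[1] != "RemoteAddress:Port":
            # Backend line belonging to the most recent service.
            services[-1]["backends"].append(fields[1])
    return services


# The listing from this article, used as sample input.
listing = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.233.0.1:443 rr persistent 10800
  -> 195.154.162.187:6443         Masq    1      0          0
  -> 195.154.165.191:6443         Masq    1      0          0
  -> 62.210.115.35:6443           Masq    1      2          0
"""

for service in parse_ipvsadm(listing):
    print(service["virtual"], service["scheduler"], service["backends"])
```

On a live node, the input would be the text returned by running ipvsadm -l -n with sufficient privileges.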