The new LTS version of DPDK was recently released (on November 29th). I’ve been using DPDK with Open vSwitch for a long time, so I decided to test the newest version and see whether upgrading is worth the effort.

TL;DR

Some background

DPDK (Data Plane Development Kit) is an awesome open-source project. Its goal is to make handling network traffic faster by moving packet processing from the kernel into the application itself. It does so by detaching the NIC from the kernel’s network stack and letting the application talk to it directly through a dedicated kernel module. If you want to learn more about it, check out their site:

DPDK is very useful, but it only provides a framework, not a complete app. Instead of writing a network switch app for my production switch, I used another great open-source project called Open vSwitch (OVS). OVS is a software network switch that has a DPDK-based version. OVS also has tons of advanced features, such as sFlow, traffic filtering and traffic shaping. You can download it from here:

My story

As an IT engineer, part of my job is to design new networks, where I often find that the main bottleneck is the network switches. About a year ago I searched for a more robust switch implementation to use in a new large network I designed, and I came across OVS. After reading more about it, and doing a lot of testing, I started using OVS as my go-to production switch.

The current OVS version (which I’ve been using) uses DPDK 18.11. When the new DPDK version was released, I wondered whether I should update my switches. Mainly, I had two questions:

How easy is it to upgrade OVS to the new DPDK version?

Is the new version of DPDK better?

Compiling Open vSwitch with DPDK 19.11

OVS has a very detailed guide for compiling it with DPDK 18.11. The guide hasn’t been updated for version 19.11 (yet, hopefully), so I needed to figure it out by myself.

I started by compiling everything exactly as the guide says. Unsurprisingly, a few errors appeared:

lib/netdev-dpdk.c:86:38: error: ‘ETHER_HDR_LEN’ undeclared (first use in this function)
lib/netdev-dpdk.c:86:54: error: ‘ETHER_CRC_LEN’ undeclared (first use in this function)
lib/netdev-dpdk.c:618:20: error: ‘ETHER_MTU’ undeclared (first use in this function)
lib/netdev-dpdk.c:88:46: error: ‘ETHER_HDR_LEN’ undeclared (first use in this function)
lib/netdev-dpdk.c:88:62: error: ‘ETHER_CRC_LEN’ undeclared (first use in this function)
lib/netdev-dpdk.c:91:40: error: ‘ETHER_HDR_LEN’ undeclared (first use in this function)
lib/netdev-dpdk.c:91:56: error: ‘ETHER_CRC_LEN’ undeclared (first use in this function)
lib/netdev-dpdk.c:933:20: error: ‘ETHER_MTU’ undeclared (first use in this function)
lib/netdev-dpdk.c:1045:23: error: storage size of ‘eth_addr’ isn’t known

This may look intimidating at first, but in fact all the errors are very similar — they’re all about undeclared symbols (with names starting with “ETHER”). Considering OVS compiled successfully with the previous DPDK version, I assumed the problem originated from some DPDK API change.

I searched DPDK 18.11 for files defining these undeclared symbols and found /lib/librte_net/rte_ether.h:

#define ETHER_ADDR_LEN 6 /**< Length of Ethernet address. */
#define ETHER_TYPE_LEN 2 /**< Length of Ethernet type field. */
#define ETHER_CRC_LEN 4 /**< Length of Ethernet CRC. */
#define ETHER_HDR_LEN (ETHER_ADDR_LEN * 2 + ETHER_TYPE_LEN) /**< Length of Ethernet header. */
#define ETHER_MIN_LEN 64 /**< Minimum frame len, including CRC. */
#define ETHER_MAX_LEN 1518 /**< Maximum frame len, including CRC. */
#define ETHER_MTU (ETHER_MAX_LEN - ETHER_HDR_LEN - ETHER_CRC_LEN) /**< Ethernet MTU. */
#define ETHER_MAX_VLAN_FRAME_LEN (ETHER_MAX_LEN + 4) /**< Maximum VLAN frame length, including CRC. */
#define ETHER_MAX_JUMBO_FRAME_LEN 0x3F00 /**< Maximum Jumbo frame length, including CRC. */
#define ETHER_MAX_VLAN_ID 4095 /**< Maximum VLAN ID. */
#define ETHER_MIN_MTU 68 /**< Minimum MTU for IPv4 packets, see RFC 791. */
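A search like this can be sketched as a recursive grep over the header files. This is only an illustration: the helper name find_definitions is made up for this example, and the dpdk-18.11/lib path in the usage comment is an assumption about where the sources are unpacked.

```bash
#!/bin/bash
# Print every "#define <symbol>" found in header files under a source tree.
# Usage: find_definitions <source-dir> <symbol>...
# (find_definitions is an illustrative helper, not part of DPDK or OVS.)
find_definitions() {
    local dir=$1
    shift
    local sym
    for sym in "$@"; do
        # -r: recurse, -n: print line numbers, -E: extended regex.
        # \b keeps ETHER_MTU from also matching a longer name.
        grep -rnE --include='*.h' "#define[[:space:]]+${sym}\b" "$dir"
    done
}

# Example (the path is an assumption; adjust it to your checkout):
# find_definitions dpdk-18.11/lib ETHER_HDR_LEN ETHER_CRC_LEN ETHER_MTU
```

Each hit is printed as file:line:text, which makes it easy to open the same file in the newer release and compare.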

I opened the same file in version 19.11 — and found the changes:

#define RTE_ETHER_ADDR_LEN 6 /**< Length of Ethernet address. */
#define RTE_ETHER_TYPE_LEN 2 /**< Length of Ethernet type field. */
#define RTE_ETHER_CRC_LEN 4 /**< Length of Ethernet CRC. */
#define RTE_ETHER_HDR_LEN (RTE_ETHER_ADDR_LEN * 2 + RTE_ETHER_TYPE_LEN) /**< Length of Ethernet header. */
#define RTE_ETHER_MIN_LEN 64 /**< Minimum frame len, including CRC. */
#define RTE_ETHER_MAX_LEN 1518 /**< Maximum frame len, including CRC. */
#define RTE_ETHER_MTU (RTE_ETHER_MAX_LEN - RTE_ETHER_HDR_LEN - RTE_ETHER_CRC_LEN) /**< Ethernet MTU. */
#define RTE_ETHER_MAX_VLAN_FRAME_LEN (RTE_ETHER_MAX_LEN + 4) /**< Maximum VLAN frame length, including CRC. */
#define RTE_ETHER_MAX_JUMBO_FRAME_LEN 0x3F00 /**< Maximum Jumbo frame length, including CRC. */
#define RTE_ETHER_MAX_VLAN_ID 4095 /**< Maximum VLAN ID. */
#define RTE_ETHER_MIN_MTU 68 /**< Minimum MTU for IPv4 packets, see RFC 791. */

We can clearly see that the symbols stayed pretty much the same, with an “RTE_” prefix added to their names. This change is documented in version 19.08’s API changes:

“The network structures, definitions and functions have been prefixed by rte_ to resolve conflicts with libc headers”.
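Since only the names changed, the derived values are the same in both versions. As a quick arithmetic check, here are the two computed constants worked out in shell arithmetic. The shell variables simply mirror the macro names from the header; the numbers are the ones quoted from it.

```bash
#!/bin/bash
# Base values as defined in rte_ether.h.
RTE_ETHER_ADDR_LEN=6
RTE_ETHER_TYPE_LEN=2
RTE_ETHER_CRC_LEN=4
RTE_ETHER_MAX_LEN=1518

# Header = destination MAC + source MAC + EtherType field.
RTE_ETHER_HDR_LEN=$(( RTE_ETHER_ADDR_LEN * 2 + RTE_ETHER_TYPE_LEN ))

# MTU = maximum frame length minus the header and the CRC.
RTE_ETHER_MTU=$(( RTE_ETHER_MAX_LEN - RTE_ETHER_HDR_LEN - RTE_ETHER_CRC_LEN ))

echo "header length: $RTE_ETHER_HDR_LEN bytes"   # 6 * 2 + 2 = 14
echo "MTU: $RTE_ETHER_MTU bytes"                 # 1518 - 14 - 4 = 1500
```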

To sort this problem out, I wrote a small shell script that changes the netdev-dpdk.c file in OVS’s source code to use the new interface:

#!/bin/bash
# Adapt lib/netdev-dpdk.c to the rte_-prefixed names used since DPDK 19.08.
# The g flag replaces every occurrence on a line, so no pass has to be repeated.
sed -i "s/ ETHER_/ RTE_ETHER_/g" lib/netdev-dpdk.c
sed -i "s/(ETHER_/(RTE_ETHER_/g" lib/netdev-dpdk.c
sed -i "s/ e_RTE_METER_/ RTE_COLOR_/g" lib/netdev-dpdk.c
sed -i "s/struct ether_addr/struct rte_ether_addr/g" lib/netdev-dpdk.c
sed -i "s/struct ether_hdr/struct rte_ether_hdr/g" lib/netdev-dpdk.c
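An optional sanity check before rebuilding is to scan the file for any names that still lack the prefix. The sketch below is an illustration only: find_old_names is a made-up helper, and its pattern covers just the ETHER_ macros and the two struct names handled by the script.

```bash
#!/bin/bash
# List lines that still use the old names: ETHER_* macros without the RTE_
# prefix, or the old "struct ether_addr" / "struct ether_hdr" types.
# Usage: find_old_names <file>
# Exit status follows grep: 0 if leftovers were found, 1 if the file is clean.
# (find_old_names is an illustrative helper, not part of DPDK or OVS.)
find_old_names() {
    grep -nE '(^|[^A-Za-z0-9_])ETHER_[A-Z_]+|struct ether_(addr|hdr)([^A-Za-z0-9_]|$)' "$1"
}

# Example, run from the OVS source root:
# find_old_names lib/netdev-dpdk.c || echo "no old names left"
```

The character class in front of ETHER_ is what keeps already converted names such as RTE_ETHER_MTU from being reported.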

After doing so, the code compiled successfully and my DPDK 19.11 Open vSwitch was ready to go!

$ ovs-vswitchd --version

ovs-vswitchd (Open vSwitch) 2.12.0

DPDK 19.11.0



$ ovs-vsctl get Open_vSwitch . dpdk_version

"DPDK 19.11.0"

Testing the new switch

After all the compilation we are left with the most important question — is the new version better?

I chose to test my switch’s capabilities in two aspects — traffic rate and latency. More precisely, I conducted two tests:

What’s the best traffic rate I can get through the switch?

What’s the average latency in the switch, and how does the traffic rate affect it?

Both tests took place in a new network I’m setting up. This is a company’s telephony network, which consists of two large telephony switches, a few dozen VoIP phones (which made the tests more realistic), a couple of administration computers and my OVS as the main switch of the network.

I used two of the computers to test the switch’s performance. Also, to gain a broader perspective I compiled my switch with DPDK versions 19.05 and 18.05 (which was easily done, same as 18.11).

Test 1: traffic rate

This test was very straightforward: I used JPerf to find the maximum traffic rate I could generate between the two computers through my switch. The software opens a couple of UDP connections and tries to transfer as much random data as possible. These are the results:

Traffic rate over time

We can see the traffic rate is the highest using the newest version. We can also see that DPDK 18.05 is the slowest, and 18.11 and 19.05 are quite similar.

Test 2: latency

Another important aspect of any network device is latency. This is especially crucial in telephony networks (such as the one I’m doing my tests in), because high latency can damage the calls’ quality.

In this test I slowly ramped up the amount of traffic passing through the switch while measuring the latency. I used JPerf to produce the increasing traffic, while sending PING requests between the computers to measure the response time. Each test ended when I observed a significant latency spike that rendered the network effectively unusable.
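The latency side of such a test can be sketched in a few lines of shell. This is a minimal illustration, assuming a ping whose summary line uses the common min/avg/max layout (as on Linux); the function names and the address in the usage comment are placeholders, not the commands used for the measurements here.

```bash
#!/bin/bash
# Extract the average round-trip time (in ms) from ping's summary line, e.g.
#   rtt min/avg/max/mdev = <min>/<avg>/<max>/<mdev> ms
# Splitting on "/" puts the average value in the fifth field.
avg_rtt_ms() {
    awk -F'/' '/min\/avg\/max/ { print $5 }'
}

# Send <count> echo requests to <host> and print the average latency.
# Usage: measure_latency <host> [count]
measure_latency() {
    local host=$1
    local count=${2:-20}
    ping -c "$count" "$host" | avg_rtt_ms
}

# Example: call this once per load step while the traffic generator runs
# (192.0.2.10 is a placeholder address):
# measure_latency 192.0.2.10 20
```

Recording one average per traffic step gives exactly the kind of latency-versus-rate series plotted in this test.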

Latency over traffic rate

A few interesting notes:

Overall, DPDK 19.11 shows better performance here too. The traffic “threshold”, where the latency spike begins, is at a higher traffic rate in 19.11 compared to the other versions. Moreover, the spike itself is more moderate, which indicates that 19.11 handles traffic-heavy networks better.

Conclusion

Looking at the entire process, I can confidently say that my questions have been answered. Compiling a DPDK-based project (Open vSwitch in my case) with the newest version was very easy and only required adapting to a few minor API changes. I also tested a couple of versions and found that version 19.11 performs better in both traffic rate and latency. Overall, the upgrade seems to be more than worth it!

I really appreciate the DPDK and Open vSwitch communities for creating these state-of-the-art projects and making them open source for everyone to benefit from. I find them both extremely useful and very easy to work with.

I tried to make the process I described here as generic as possible, but it is obviously tied to my specific devices and setup. Nevertheless, I expect that compiling the new version with other applications will be similarly easy. If you run into trouble upgrading, refer to the DPDK documentation, specifically the release notes, which are very detailed and comprehensive.