The return of nftables

Please consider subscribing to LWN Subscriptions are the lifeblood of LWN.net. If you appreciate this content and would like to see more of it, your subscription will help to ensure that LWN continues to thrive. Please visit this page to join up and keep LWN on the net.

Some ideas take longer than others to find their way into the mainline kernel. The network firewalling mechanism known as "nftables" would be a case in point. Much of this work was done in 2009; despite showing a lot of promise at the time, the work languished for years afterward. But, now, there would appear to be a critical mass of developers working on nftables, and we may well see it merged in the relatively near future.

A firewall works by testing a packet against a chain of one or more rules. Any of those rules may decide that the packet is to be accepted or rejected, or it may defer judgment for subsequent rules. Rules may include tests that take forms like "which TCP port is this packet destined for?", "is the source IP address on a trusted network?", or "is this packet associated with a known, open connection?", for example. Since the tests applied to packets are expressed in networking terms (ports, IP addresses, etc.), the code that implements the firewall subsystem ("netfilter") has traditionally contained a great deal of protocol awareness. In fact, this awareness is built so deeply into the code that it has had to be replicated four times — for IPv4, IPv6, ARP, and Ethernet bridging — because the firewall engines are too protocol-specific to be used in a generic manner.

That duplication of code is one of a number of shortcomings in netfilter that have long driven a desire for a replacement. In 2009, it appeared that such a replacement was in the works when Patrick McHardy announced his nftables project. Nftables replaces the multiple netfilter implementations with a single packet filtering engine built on an in-kernel virtual machine, unifying firewalling at the expense of putting (another) bytecode interpreter into the kernel. At the time, the reaction to the idea was mostly positive, but work stalled on nftables just the same. Patrick committed some changes in July 2010; after that, he made no more commits for more than two years.

Frustrations with the current firewalling code did not just go away, though. Over time, it also became clear that a general-purpose in-kernel packet classification engine could find uses beyond firewalls; packet scheduling is another fairly obvious possibility. So, in October 2012, current netfilter maintainer Pablo Neira Ayuso announced that he was resurrecting Patrick's nftables patches with an eye toward relatively quick merging into the mainline. Since then, development of the code has accelerated, with nftables discussion now generating much of the traffic on the netfilter mailing list.

Nftables as it exists today is still built on the core principles designed by Patrick. It adds a simple virtual machine to the kernel that is able to execute bytecode to inspect a network packet and make decisions on how that packet should be handled. The operations implemented by this machine are intentionally basic: it can get data from the packet itself, look at the associated metadata (which interface the packet arrived at, for example), and manage connection tracking data. Arithmetic, bitwise, and comparison operators can be used to make decisions based on that data. The virtual machine is capable of manipulating sets of data (typically IP addresses), allowing multiple comparison operations to be replaced with a single set lookup. There is also a "map" type that can be used to store packet decisions directly under a key of interest — again, usually an IP address. So, for example, a whitelist map could hold a set of known IP addresses, associating an "accept" verdict with each.

Replacing the current, well-tuned firewalling code with a dumb virtual machine may seem like a step backward. As it happens, there are signs that the virtual machine may be faster than the code it replaces, but there are a number of other advantages independent of performance. At the top of the list is removing all of the protocol awareness from the decision engine, allowing a single implementation to serve everywhere a packet inspection engine is required. The protocol awareness and associated intelligence can, instead, be pushed out to user space.

Nftables also offers an improved user-space API that allows the atomic replacement of one or more rules with a single netlink transaction. That will speed up firewall changes for sites with large rulesets; it can also help to avoid race conditions while the rule change is being executed.

The code worked reasonably well in 2009, though there were a lot of loose ends to tie down. At the top of Pablo's list of needed improvements to nftables when he picked up the project was a bulletproof compatibility layer for existing netfilter-based firewalls. A new rule compiler will take existing firewall rules and compile them for the nftables virtual machine, allowing current firewall setups to migrate with no changes needed. This compatibility code should allow nftables to replace the current netfilter tables relatively quickly. Even so, chances are that both mechanisms will have to coexist in the kernel for years. One of the other design goals behind nftables — use of the existing netfilter hook points, connection-tracking infrastructure, and more — will make that coexistence relatively easy.

Since the work on nftables restarted, the repository has seen over 70 commits from a half-dozen developers; there has also been a lot of work going into the user-space nft tool and libnftables library. The kernel changes have added missing features (the ability to restore saved counter values, for example), compatibility hooks allowing existing netfilter extensions to be used until their nftables replacements are ready, many improvements to the rule update mechanism, IPv6 NAT support, packet tracing support, ARP filtering support, and more. The project appears to have picked up some momentum; it seems unlikely to fall into another multi-year period without activity before being merged.

As to when that merge will happen...it is still too early to say. The developers are closing in on their set of desired features, but the code has not yet been exposed to wide review beyond the netfilter list. All that can be said with certainty is that it appears to be getting closer and to have the development resources needed to finish the job.

See the nftables web page for more information. A terse but useful HOWTO document has been posted by Eric Leblond; it is probably required reading for anybody wanting to play with this code, but a quick, casual read will also answer a number of questions about what firewalling will look like in the nftables era.

