The bootx86.efi image is relatively big, about 42M in our case, mainly because of the size of our initrd.img. Needless to say, TFTP is not optimal for downloading such a big file: it runs over UDP rather than TCP, and so lacks TCP's built-in retransmission and congestion control. Although the UEFI specification says it should be possible, we have yet to succeed in using HTTP for UEFI network boot over IPv6. Until this works correctly, TFTP is a workable solution.

In this configuration, UEFI downloads the image and chainloads the vmlinuz in the memdisk (no need for Grub to download it separately from the network); this kernel then contacts the kickstart server and starts the configuration of the server.
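The chainload step can be sketched as a grub.cfg embedded in the memdisk. This is a minimal sketch, not our actual configuration: the paths, menu entry name, and the kickstart URL are illustrative assumptions.

```
# Hypothetical grub.cfg embedded in the EFI image's memdisk.
# The kernel and initrd are read from the memdisk itself, so only the
# kickstart file is fetched over the network (URL is an assumption).
menuentry 'Provision server' {
    linux (memdisk)/vmlinuz inst.ks=http://[2001:db8::15]/ks.cfg ip=dhcp
    initrd (memdisk)/initrd.img
}
```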

Anycast for VIPs

While many load balancers support IPv6, we took a different approach from the available options in order to improve scalability. We configure VIPs on servers by adding the IPs on the local interface, and then advertise these IPs over BGP sessions (using bird and bird6) to the connected switch. Because this is an internal network, we can choose the level of aggregation for the advertised routes: /32 for IPv4 and /128 for IPv6. Be careful, however: some network gear may not be able to hold more than 128 routes for /128 prefixes, so it may be a better strategy to reserve a complete /64 even for a single IPv6 address used as anycast. This approach gives us greater flexibility by removing the requirement that the servers be in the same network segment as the load balancer.
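The advertisement side can be sketched as a bird6 configuration fragment. This is a hedged sketch: the VIP address, neighbor address, AS numbers, and protocol names are all illustrative assumptions, not our production configuration.

```
# Hypothetical bird6 fragment: advertise a /128 anycast VIP, configured
# on the local interface, to the directly connected switch over BGP.
protocol static {
    route 2001:db8:100::53/128 via "lo";   # the anycast VIP
}

protocol bgp uplink {
    local as 65010;                        # this server's private ASN
    neighbor 2001:db8:1::1 as 65000;       # the connected switch
    export all;                            # announce the static VIP route
}
```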

We are using CFEngine to configure the anycast VIPs, and we had to modify some RedHat network scripts to make the configuration similar for IPv4 and IPv6. This is described in the section on CFEngine.

Management

IPMI

IPMI is a very important part of detecting hardware problems before they have noticeable effects, and of being able to query a machine's status even when the operating system is down.

Intelligent Platform Management Interface (IPMI) is a standard to access a server without having an operating system installed. It allows an administrator to remotely control and view the screen, for instance, while the server is booting up, or to monitor the health of various hardware components like the hard drive. This access can be “out of band” or “side band.” “Out of band” means that an interface is reserved for this management and is in its own network. In a “side band” design, the same physical interface is used for normal network traffic and management in separate networks.

Because management may happen before any operating system is installed, IPMI must be automatically configurable on an IPv6-only network. Later, once the operating system is installed, ipmitool on the server allows the user to query and set hardware components.

ipmitool on Linux has supported IPv6 since version 1.8.14, which was released in May 2015. However, that version still needs to find its way into some Linux distributions: it is not yet present in RedHat 7, for instance, so we have to create our own build.
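As a hedged sketch of what this enables (the channel number, BMC address, and credentials are illustrative assumptions), ipmitool 1.8.14 or later can inspect a BMC's IPv6 settings and reach it over an IPv6 address:

```
# Inspect the BMC's IPv6 configuration on channel 1 (requires ipmitool >= 1.8.14)
ipmitool lan6 print 1

# Query chassis power status remotely, using the BMC's IPv6 address
ipmitool -I lanplus -H 2001:db8::b1 -U admin -P secret chassis power status
```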

In addition, you need a network card with the appropriate firmware to support IPMI IPv6 extensions. Such a network card would also likely be able to do UEFI and PXE over IPv6. We have been trialling a Redfish-compatible firmware as specified by the Distributed Management Task Force with some of our vendors.

For long-term support, administrators may have to update the currently running Linux to a version with complete IPv6 support; this may have downstream impacts, not to mention the difficulty of flashing firmware.

CFEngine

Another part of managing servers is being able to configure them. There are many tools available: Chef, Puppet, Foreman, etc. We are using CFEngine. While CFEngine works over IPv6, some of its built-in functions for gathering IPv4 configuration data are not yet implemented for IPv6. We created a CFEngine module that gathers IPv6 information from the host and lets us feed it into any CFEngine sequence we want. This module saves us a tremendous amount of time when configuring servers with IPv6: setting up the network interface, creating the right iptables rules, updating software configuration files with IPv6 support, etc.

The automation method we use relies heavily on Linux virtual interfaces, which required us to patch the network-scripts on Linux to make them manageable by CFE. The difficulty is with additional IP addresses on an interface: with IPv4, these are handled natively via per-virtual-interface configuration, while for IPv6 they all live in a single variable listing the additional IPs. Our patch allowed us to use virtual interfaces with IPv6, and we contributed our modifications to RedHat for consideration.
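The asymmetry described above can be sketched with the stock RedHat network-scripts conventions (the file names and addresses are illustrative assumptions):

```
# IPv4: each additional address gets its own virtual-interface file, e.g.
# /etc/sysconfig/network-scripts/ifcfg-eth0:1
DEVICE=eth0:1
IPADDR=10.0.0.53
PREFIX=32

# IPv6: all additional addresses live in one variable inside ifcfg-eth0
IPV6ADDR_SECONDARIES="2001:db8:100::53/128 2001:db8:100::54/128"
```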

IPv6 support for CFE is tracked here, on a support ticket that describes minor bugs. The CFE developers report that the software works in IPv6-only environments. For us, it works well in a dual-stack environment, where we do see IPv6 traffic, but we have not yet verified that it works in an IPv6-only environment. We have always worked closely with the CFE developers to provide feedback on our experience, and they are very responsive; some of this feedback quickly finds its way into releases, so we are confident that any issues we find will be promptly addressed.

Other configuration systems have various levels of support for IPv6. What is important is to test them in real-life scenarios and provide feedback so that each system is always improving.

Monitoring (and metrics)

Over the years, we have built our own monitoring and alerting services, called inGraphs and AutoAlerts, described here.

This software provides a framework for various collectors to gather metrics from a server, a network device, an application, or a service, and to send this information to be graphed. The process may also generate alerts when the data deviates from what is expected (a full hard drive, high CPU usage, too much traffic on an interface, too many requests to a service, etc.). We monitor many metrics per device, which lets us understand trends and daily and weekly cycles, and warns us when something is degrading before it fails outright. These metrics are also useful for post-mortem analysis.

inGraphs and AutoAlerts are simply the framework on which we deploy the scripts and probes that measure how our operations and services behave, whether on IPv4 or IPv6. Any organization can do the same: build new graphs and alerts into its own monitoring system to flag when a service performs differently depending on the protocol stack.

As we transition from IPv4-only to a dual-stack environment, and ultimately to IPv6-only, we need metrics that tell us whether a service is operating correctly over both IPv4 and IPv6. Do we observe a difference in latency, queries per second, etc.? Even when the product itself does not change, this means having tools and instrumentation that collect the data needed to compare performance on IPv4 and IPv6. For instance, we found that some network equipment offers traffic breakdowns for IP, ICMP, TCP, or UDP, but not the split between IPv4 and IPv6 traffic. That split is a much-needed metric for measuring progress and success, and it will be essential when the time comes to remove IPv4.
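A collector-side helper for that split can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name is our own, and a real collector would read peer addresses from sockets or flow records rather than a hard-coded list.

```python
# Sketch: tally observed peer addresses per IP family, so the IPv4/IPv6
# traffic split can be sent to a graphing system as a metric.
import ipaddress
from collections import Counter

def family_breakdown(peer_addresses):
    """Return a Counter of connections per IP family ('ipv4' or 'ipv6')."""
    counts = Counter()
    for addr in peer_addresses:
        ip = ipaddress.ip_address(addr)  # raises ValueError on garbage input
        counts['ipv6' if ip.version == 6 else 'ipv4'] += 1
    return counts

# Illustrative input only; a real collector would feed live peer addresses.
print(family_breakdown(['10.1.2.3', '2001:db8::1', '192.0.2.7']))
```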

A large company like LinkedIn cannot move to IPv6 without measuring many metrics to understand whether the issues we discover stem from differences between our IPv4 and IPv6 implementations. Sometimes changes are not directly obvious, and sometimes they are very small in percentage terms but still important: 0.1% of 400 million members is still 400,000 members.

Decommissioning

Eventually, you will need to decommission machines: removing them from operations, wiping the data, removing them from the network and, finally, unracking them. In an IPv6-only environment, the tooling must be able to perform all of these operations over an IPv6-only network.

Planning early for how decommissioning will happen, while you are still architecting the provisioning of machines, will help you in the future. We have this in place for IPv4 and are looking into how to execute it in an IPv6-only environment.

The elephant in the room (software support)

There are many pieces of software that need to be IPv6-capable. In many organizations, a large share of the deployed devices and software belongs to a Hadoop grid, so we will take Hadoop as an example, but we will then look at other software and strategies as well.

Hadoop

Many large companies have to store and analyze big data, and one of the most common systems for this today is Hadoop, which usually requires lots of servers for data storage and computation. Today, Hadoop does not support IPv6. The community has created a development fork to add IPv6 support; once this fork is tested, it will be merged back into the main branch. This is how the Hadoop community tends to work when new features are added, as it gives a safe development environment for both the feature and the main branch.

Engineers at Facebook have been contributing to the effort to add IPv6 support to Hadoop, and we are looking at how we can participate. The project is progressing well so far. This is awesome work from all involved with Hadoop, but needless to say, as with all open source projects, we believe the community always welcomes more help.

Once Hadoop supports IPv6, we could easily have large deployments of IPv6-only machines in many organizations, saving millions of IPv4 addresses for each organization.

We look forward to being able to test Hadoop with IPv6 support, and to reporting and helping fix bugs as we find them.

Other software

Many tools and pieces of software need to be modified to handle IPv6 and its data structures. For regular socket connections, there are a few strategies to consider. By default, the operating system resolves a hostname to its preferred IP address (IPv4 or IPv6); this preference is usually IPv6 when the device has a global IPv6 address on one of its interfaces, though it can be changed to always prefer IPv4. The other strategy is to get all the addresses for a hostname and iterate over them in the application itself, independently of the operating system's preference. For instance, Java has a couple of flags (such as java.net.preferIPv6Addresses) that you can set to enable connections over IPv6. For more examples, you can find a good tutorial on Python and sockets here.
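The "get all the addresses and iterate" strategy can be sketched in Python. This is a hedged sketch (the function name and timeout are our own choices): getaddrinfo returns both IPv6 and IPv4 results in the order the OS prefers, and we fall back to the next address when a connection fails.

```python
# Sketch: try each address a hostname resolves to, in OS-preference order,
# until one accepts the connection. Works unchanged for IPv4 and IPv6.
import socket

def connect_any(host, port, timeout=3):
    """Return a connected TCP socket, trying every resolved address."""
    last_err = None
    for family, socktype, proto, _name, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)  # sockaddr is (host, port[, flow, scope])
            return sock
        except OSError as err:
            last_err = err          # remember the failure, try the next one
    raise last_err or OSError("no addresses found for %s" % host)
```

Iterating in the application, as here, is what lets software keep working as hosts gain or lose IPv6 addresses, instead of pinning behavior to one family.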

This is just one example of software that is not yet ready for IPv6. Disabling IPv6 in the operating system is still a frequently recommended workaround for any problem with IPv6 support! This is a pity, because it does nothing to solve the real problem.

In some situations, instead of fixing the software itself to work with IPv6 (a task that may depend on an external third party), you can put a wrapper or proxy around it. For instance, Apache in proxy mode and Nginx have both been used extensively as IPv6 frontends to IPv4-only backends; this is how many sites first offered connectivity over IPv6. The same technique works for internal sites, making people more familiar with IPv6 and increasing IPv6 traffic internally, thereby raising visibility so that people know coding for IPv4 only is no longer an option.
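The proxy approach can be sketched as an nginx fragment. This is an illustrative sketch only: the server name and backend address are assumptions.

```
# Hypothetical nginx snippet: a dual-stack frontend proxying to an
# IPv4-only backend application.
server {
    listen 80;          # IPv4 clients
    listen [::]:80;     # IPv6 clients
    server_name internal.example.com;

    location / {
        proxy_pass http://10.0.0.42:8080;  # the IPv4-only backend
    }
}
```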

Preparing applications for IPv6

The American Registry for Internet Numbers (ARIN) has published this documentation on how to ensure developers write network-agnostic code.

Next steps

There is still some road ahead, but we are on the cusp of full IPv6 support for all devices and software. No domain remains untouched by the need for IPv6; it is no longer the work or the need of a few, and many now recognize it.

Migrating all our software to connect over IPv6 is the task we are tackling over the next few months, and we will continue to improve the way we provision devices. Once this is done, it will be time to start retiring IPv4.

Acknowledgements

In this part, we would like to acknowledge some of the external people who helped us in this endeavour. There are many more, and our apologies to those we have forgotten or could not name: Fred Baker, John Brzozowski, Vint Cerf, Lorenzo Colitti, Jason Fesler, Lee Howard, Pradeep Kathail, Martin Levy, Christopher Morikang, Paul Saab, Mark Townsley, Eric Vyncke, Dan Wing, and Jan Zorz.