Summary

Replace the underlying implementations of the java.net.DatagramSocket and java.net.MulticastSocket APIs with simpler and more modern implementations that are easy to maintain and debug. The new implementations will be easy to adapt to work with virtual threads, currently being explored in Project Loom. This is a follow-on to JEP 353, which already reimplemented the legacy Socket API.

Motivation

The code base of the java.net.DatagramSocket and java.net.MulticastSocket APIs, and their underlying implementations, is old and brittle:

The implementations date back to JDK 1.0. They are a mix of legacy Java and C code that is difficult to maintain and debug.

The implementation of MulticastSocket is particularly problematic since it dates back to a time when IPv6 was still under development. Much of the underlying native implementation tries to reconcile IPv4 and IPv6 in ways that are difficult to maintain.

The implementation also has several concurrency issues (e.g., with asynchronous close) that require an overhaul to address properly.

In addition, in the context of virtual threads that park rather than block underlying kernel threads in system calls, the current implementation is not fit for purpose. As datagram-based transports gain traction again (e.g. QUIC), a simpler and more maintainable implementation is needed.

Description

Currently, the DatagramSocket and MulticastSocket classes delegate all socket calls to a java.net.DatagramSocketImpl implementation, for which different platform-specific concrete implementations exist: PlainDatagramSocketImpl on Unix platforms, and TwoStackPlainDatagramSocketImpl and DualPlainDatagramSocketImpl on Windows platforms. The abstract DatagramSocketImpl class, which dates back to JDK 1.1, is very under-specified and contains several obsolete methods that are an impediment to providing an implementation of this class based on NIO (see alternatives, discussed below).

Rather than provide a drop-in replacement for implementations of DatagramSocketImpl, similar to what was done in JEP 353 for SocketImpl, this JEP proposes to make DatagramSocket internally wrap another instance of DatagramSocket to which it delegates all calls directly. The wrapped instance is either a socket adapter created from a NIO DatagramChannel::socket (the new implementation), or else a clone of the legacy DatagramSocket class which then delegates to the legacy DatagramSocketImpl implementation (for the purpose of implementing a backward compatibility switch). If a DatagramSocketImplFactory is installed by an application, the legacy implementation is selected. Otherwise, the new implementation is selected and used by default.
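The socket adapter mentioned above is reachable through the public NIO API, so its behavior can be explored directly. The following is a minimal sketch, not the JDK's internal delegate wiring (which is not accessible to applications); the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.DatagramChannel;

// Hypothetical illustration class; not part of the JDK.
final class AdapterSketch {

    // Opens a NIO DatagramChannel and returns its DatagramSocket view.
    // Closing the returned socket also closes the underlying channel.
    static DatagramSocket openAdapter() throws IOException {
        DatagramChannel channel = DatagramChannel.open();
        return channel.socket();
    }

    public static void main(String[] args) throws IOException {
        try (DatagramSocket socket = openAdapter()) {
            // Bind to an ephemeral port on the loopback address.
            socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            System.out.println("bound: " + socket.isBound());
            // A socket obtained from a channel reports that channel.
            System.out.println("has channel: " + (socket.getChannel() != null));
        }
    }
}
```

Code that already works against such an adapter is a reasonable first indication that it will behave the same way with the new default implementation.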

To reduce the risk of switching the implementation after more than twenty years, the legacy implementation will not be removed. A JDK-specific system property, jdk.net.usePlainDatagramSocketImpl, is introduced to configure the JDK to use the legacy implementation (see risks and assumptions, below). If set with no value or set to the value "true" at startup, the legacy implementation is used. Otherwise, the new (NIO-based) implementation is used. In some future release we will remove the legacy implementation and the system property. At some point we may also deprecate and remove DatagramSocketImpl and DatagramSocketImplFactory.
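The rule for the property value can be written as a small predicate. This is a sketch of the rule as stated above, not the JDK's actual code: the helper name is hypothetical, and the exact, case-sensitive comparison with "true" is an assumption, since the text does not say how case is treated.

```java
// Hypothetical illustration class; not part of the JDK.
final class LegacySwitchSketch {

    static final String PROPERTY = "jdk.net.usePlainDatagramSocketImpl";

    // Returns true when the given property value would select the legacy
    // implementation: set with no value (empty string) or set to "true".
    // A null value means the property is not set at all.
    // Assumption: the comparison with "true" is exact.
    static boolean selectsLegacy(String value) {
        return value != null && (value.isEmpty() || value.equals("true"));
    }

    public static void main(String[] args) {
        // The JDK reads the property at startup, so it is normally passed on
        // the command line (-Djdk.net.usePlainDatagramSocketImpl) rather
        // than set programmatically later.
        String value = System.getProperty(PROPERTY);
        System.out.println("legacy selected: " + selectsLegacy(value));
    }
}
```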

The new implementation is enabled by default. It provides non-interruptible behavior for datagram and multicast sockets by directly using the platform-default implementation of the selector provider (sun.nio.ch.SelectorProviderImpl and sun.nio.ch.DatagramChannelImpl). Installing a custom selector provider will thus have no effect on DatagramSocket and MulticastSocket.

Alternatives

We investigated, prototyped, and discarded two alternative approaches.

Alternative 1

Create an implementation of DatagramSocketImpl that delegates all its calls to a wrapped DatagramChannel and sun.nio.ch.DatagramSocketAdaptor. Upgrade sun.nio.ch.DatagramSocketAdaptor to extend java.net.MulticastSocket.

This approach showed that it would be relatively easy to provide an implementation of DatagramSocketImpl based on DatagramChannel. The tests passed, but the prototype also highlighted several limitations:

The security checks were performed twice, once in DatagramSocket and once more in DatagramChannel (or its socket adapter). There were ways to avoid the double security checks but they would have been cumbersome.

The connect emulation implemented at the DatagramSocket level also got in the way, since we didn't want to perform this emulation with the NIO-based implementation.

As with the solution proposed above, the main advantage of this alternative compared to the second alternative below was that no new native code was necessary, since every call could be delegated to DatagramChannel .

While evaluating this alternative, it quickly became apparent that overriding methods at the DatagramSocket level, rather than at the DatagramSocketImpl level, would be simpler and more straightforward, which led to the solution proposed in this JEP.

Alternative 2

Create an implementation of DatagramSocketImpl in the sun.nio.ch package that invokes low-level sun.nio.ch.Net primitives. This allowed the implementation to directly access lower-level NIO primitives instead of relying on DatagramChannel. This was somewhat analogous to what was done for reimplementing Socket and ServerSocket in JEP 353.

The main advantage of this alternative over the first alternative was that it avoided the double security checks, since the implementation could access lower-level NIO primitives directly.

However, the new implementation had to replicate the non-trivial state and lock management that DatagramChannel already implements.

It also required the addition of new native code to match the DatagramSocketImpl interface.

The solution proposed in this JEP thus appeared much simpler, less risky, and easier to maintain.

Testing

The existing tests in the jdk/jdk repository will be used to test the new implementation. To ensure a smooth transition, the new implementation should pass the tier2 (jdk_net and jdk_nio) regression-test suite and the JCK for java_net/api. The jdk_net test group has accumulated many tests for networking corner-case scenarios over the years. Some of the tests in this test group will be modified to run twice, the second time with -Djdk.net.usePlainDatagramSocketImpl to ensure that the old implementation does not decay during the time that the JDK includes both implementations. New tests will be added as required, to expand code coverage and increase confidence in the new implementation.

Every effort will be made to publicize the proposal and encourage developers who have code using DatagramSocket and MulticastSocket to test their code with the early-access builds that are published on jdk.java.net.

The microbenchmarks in the jdk/jdk repository include benchmarks for DatagramChannel. Similar benchmarks for datagram sockets will be created if missing, or updated if they already exist, in a way that makes it easy to compare the old and new implementations.

Risks and Assumptions

The primary risk of this proposal is that there is existing code that depends upon unspecified behavior in corner cases where the old and new implementations behave differently. To minimize this risk, some preparatory work to clarify the specifications of DatagramSocket and MulticastSocket, and to minimize the behavioral differences between these classes and the DatagramChannel::socket adapter, has already been done in JDK 14 and JDK 15. Some small differences might nevertheless persist. They might be observable in corner-case situations but should be transparent to the vast majority of API users. The differences identified so far are listed below; all but the first two can be mitigated by running with either -Djdk.net.usePlainDatagramSocketImpl or -Djdk.net.usePlainDatagramSocketImpl=true.

Custom APIs or subclasses of DatagramSocket and MulticastSocket that synchronize on instances of these classes may need revisiting, since DatagramSocket and MulticastSocket no longer synchronize on this. Any locking or synchronization is left up to the delegate, which is not accessible outside of the java.net package, and is free to use any mechanism it sees fit.
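One way to revisit such code is to guard subclass state with a private lock instead of relying on the socket's monitor. The sketch below is only an illustration of that idea; the class name CountingSocket and its counter are hypothetical.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.SocketException;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical subclass that protects its own state with a private lock
// rather than assuming the superclass synchronizes on "this".
class CountingSocket extends DatagramSocket {

    private final ReentrantLock lock = new ReentrantLock();
    private long sent; // guarded by lock

    CountingSocket() throws SocketException {
        super(); // binds to an ephemeral port on the wildcard address
    }

    @Override
    public void send(DatagramPacket packet) throws IOException {
        lock.lock();
        try {
            super.send(packet);
            sent++; // counted only if the send did not throw
        } finally {
            lock.unlock();
        }
    }

    long sentCount() {
        lock.lock();
        try {
            return sent;
        } finally {
            lock.unlock();
        }
    }
}
```

Because the lock is private, the subclass's invariants do not depend on whatever mechanism the delegate uses internally.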

Similarly, custom classes that extend DatagramSocket or MulticastSocket and override methods such as bind and setReuseAddress won’t have the overridden methods invoked during construction. Anyone doing this is depending on undocumented and implementation-specific behavior.
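A subclass that needs its overridden methods to run can avoid depending on constructor behavior by creating the socket unbound and then calling those methods itself. The following is a minimal sketch under that assumption; the class name and the call log are hypothetical.

```java
import java.net.DatagramSocket;
import java.net.SocketAddress;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical subclass that invokes its overrides explicitly instead of
// expecting the superclass constructor to call them.
class LoggingSocket extends DatagramSocket {

    final List<String> calls = new ArrayList<>();

    LoggingSocket(SocketAddress bindAddress) throws SocketException {
        super((SocketAddress) null);  // a null address creates an unbound socket
        setReuseAddress(true);        // override runs because we call it here
        bind(bindAddress);            // likewise for bind
    }

    @Override
    public void setReuseAddress(boolean on) throws SocketException {
        calls.add("setReuseAddress");
        super.setReuseAddress(on);
    }

    @Override
    public void bind(SocketAddress address) throws SocketException {
        calls.add("bind");
        super.bind(address);
    }
}
```

The cast on null selects the public DatagramSocket(SocketAddress) constructor unambiguously.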

The new implementation uses the native connect method on all platforms. The legacy implementation still uses an emulation on macOS. This means, in particular, that port-unreachable conditions cannot be detected with the old implementation while they should be detected with the new one. Also, the old implementation will fall back to using the emulation if the native connect fails; the new implementation will report an error instead. In addition, the new implementation will flush the receive buffer at the time of connect, ensuring that any datagrams buffered before connect was invoked are discarded. The old implementation preserved datagrams that were sent by the connected peer and buffered before the association was performed by the kernel; the new implementation simply discards them.
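Code that uses connected datagram sockets may therefore want to handle PortUnreachableException explicitly. The sketch below shows one way to do so; the class, enum, and method names are hypothetical, and whether an unreachable port is actually reported depends on the platform and on ICMP delivery, so a timeout remains a possible outcome.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.PortUnreachableException;
import java.net.SocketTimeoutException;

// Hypothetical illustration class; not part of the JDK.
final class ConnectProbeSketch {

    enum Outcome { REPLIED, UNREACHABLE, TIMED_OUT }

    // Connects to the peer, sends one byte, and waits for a reply.
    // timeoutMillis should be positive; zero would wait indefinitely.
    static Outcome probe(InetSocketAddress peer, int timeoutMillis) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.connect(peer);
            socket.setSoTimeout(timeoutMillis);
            byte[] payload = {1};
            try {
                // On a connected socket the packet needs no explicit address.
                socket.send(new DatagramPacket(payload, payload.length));
                socket.receive(new DatagramPacket(new byte[16], 16));
                return Outcome.REPLIED;
            } catch (PortUnreachableException e) {
                return Outcome.UNREACHABLE; // may surface from send or receive
            } catch (SocketTimeoutException e) {
                return Outcome.TIMED_OUT;   // no reply and no error reported
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Find a port that most likely has no listener: bind, note it, close.
        int port;
        try (DatagramSocket tmp = new DatagramSocket(
                new InetSocketAddress(InetAddress.getLoopbackAddress(), 0))) {
            port = tmp.getLocalPort();
        }
        InetSocketAddress peer = new InetSocketAddress(InetAddress.getLoopbackAddress(), port);
        System.out.println(probe(peer, 500));
    }
}
```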

On macOS and Linux, invoking disconnect on the new implementation might require rebinding the underlying socket. This introduces the possibility that the rebinding might fail, and the underlying implementation might throw an exception, leaving the underlying socket in an unspecified state. Whereas the legacy implementation might have silently left the socket in an unspecified state, the new implementation will throw an UncheckedIOException instead.
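Callers that disconnect sockets may want to treat this exception as fatal for the socket. The following is a minimal sketch of one such policy; the helper name is hypothetical, and closing the socket on failure is a design choice of this example rather than something the text prescribes.

```java
import java.io.UncheckedIOException;
import java.net.DatagramSocket;

// Hypothetical illustration class; not part of the JDK.
final class DisconnectSketch {

    // Disconnects the socket. If the implementation reports a failure,
    // the socket is in an unspecified state, so this sketch closes it.
    // Returns true if the socket was disconnected and is still usable.
    static boolean disconnectOrClose(DatagramSocket socket) {
        try {
            socket.disconnect();
            return true;
        } catch (UncheckedIOException e) {
            socket.close();
            return false;
        }
    }
}
```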

When joining a multicast group on macOS, if no default outgoing interface has been set, and no outgoing network interface is provided, the old implementation of MulticastSocket::joinGroup will pick up a default network interface and incorrectly attempt to set it as default before joining, by silently setting the IP_MULTICAST_IF option. The new implementation based on NIO will not do this, so the IP_MULTICAST_IF option will never be silently set as a side effect of joining.
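Applications that relied on that side effect can select the outgoing interface explicitly before joining. The sketch below illustrates this with public API calls only; the class and helper names are hypothetical, the group address 239.1.2.3 and port 0 are placeholders, and choosing the first interface that is up and supports multicast is an assumption of this example, not a recommendation from the text.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Optional;

// Hypothetical illustration class; not part of the JDK.
final class ExplicitMulticastJoinSketch {

    // Returns the first interface that is up and supports multicast, if any.
    static Optional<NetworkInterface> firstMulticastInterface() throws SocketException {
        return NetworkInterface.networkInterfaces()
                .filter(ExplicitMulticastJoinSketch::usableForMulticast)
                .findFirst();
    }

    private static boolean usableForMulticast(NetworkInterface nif) {
        try {
            return nif.isUp() && nif.supportsMulticast();
        } catch (SocketException e) {
            return false; // treat an unreadable interface as unusable
        }
    }

    // Sets the outgoing interface deliberately, then joins on the same one.
    static void joinExplicitly(MulticastSocket socket, InetAddress group,
                               NetworkInterface nif) throws IOException {
        socket.setNetworkInterface(nif);
        socket.joinGroup(new InetSocketAddress(group, 0), nif);
    }

    public static void main(String[] args) throws IOException {
        Optional<NetworkInterface> nif = firstMulticastInterface();
        if (nif.isEmpty()) {
            System.out.println("no multicast-capable interface found");
            return;
        }
        try (MulticastSocket socket = new MulticastSocket(0)) {
            joinExplicitly(socket, InetAddress.getByName("239.1.2.3"), nif.get());
            System.out.println("joined on " + nif.get().getName());
        } catch (IOException e) {
            System.out.println("join failed: " + e.getMessage());
        }
    }
}
```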

The java.net package defines many sub-classes of SocketException . The new implementation will attempt to throw the same exceptions in the same situations as the old implementation, but there may be cases where they are not the same. Furthermore, there may be cases where the exception messages differ. On Windows, for example, the old implementation maps Windows Socket error codes to English-only messages, while the new NIO-based implementation uses the system messages.

Other observable behavioral differences:

A DatagramSocket created via one of its public constructors supports setting options for sending multicast datagrams. The new implementation allows you to configure multicast socket options on base instances of DatagramSocket on all platforms. The old implementation still uses a dual-stack implementation on Windows which doesn't support multicast socket options on base DatagramSocket instances. In that case an instance of MulticastSocket must be used if such options need to be configured.
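Code that must run against either implementation can check whether an option is supported before setting it. This is a minimal sketch using the public supportedOptions and setOption methods; the class and helper names are hypothetical.

```java
import java.io.IOException;
import java.net.DatagramSocket;
import java.net.StandardSocketOptions;

// Hypothetical illustration class; not part of the JDK.
final class BaseSocketMulticastOptionSketch {

    // Sets the multicast time-to-live on a plain DatagramSocket if the
    // socket reports that it supports the option. Returns true if set.
    static boolean trySetMulticastTtl(DatagramSocket socket, int ttl) throws IOException {
        if (!socket.supportedOptions().contains(StandardSocketOptions.IP_MULTICAST_TTL)) {
            return false; // caller may fall back to a MulticastSocket
        }
        socket.setOption(StandardSocketOptions.IP_MULTICAST_TTL, ttl);
        return true;
    }

    public static void main(String[] args) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            System.out.println("TTL set: " + trySetMulticastTtl(socket, 4));
        }
    }
}
```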

The new implementation fixes a number of issues, such as 8165653, simply by virtue of delegating to the NIO implementation, where these issues are not present.

Aside from behavioral differences, the performance of the new implementation may differ compared to the old when running certain workloads. This JEP will endeavor to provide some performance benchmarks to gauge the difference.

Dependencies