A group of MIT researchers has unveiled a machine-learning approach to TCP congestion control that could form the foundation of the next round of improvements to the venerable protocol's performance.

Dubbed “Remy”, their TCP control software is based on the idea that even sophisticated modern congestion control algorithms (like Compound TCP in Windows or Cubic in Linux) aren't flexible enough to cope with increasingly complex networks.

Instead, Professor Hari Balakrishnan, Fujitsu Professor of Electrical Engineering & Computer Science at MIT, believes it's better to set computers to the task of identifying what TCP settings work best under particular conditions.

Their work, available in a pre-publication version, appears to show that by replacing manually-generated congestion control with Remy, networks could achieve far better performance than with any of the current TCP congestion control algorithms.

The idea is that a subnetwork that's got a high-capacity fibre on the other side of the router is going to have completely different congestion behaviours to one that's connected over a 3G wireless connection. For example, the naturally-higher latency of a wireless connection can look like congestion to an endpoint, because of its slow ACK times.

The fundamental problem the MIT group is trying to solve: TCP has a limited network model. “For example,” they write, “because TCP assumes that packet losses are due to congestion and reduces its transmission rate in response, some subnetwork designers have worked hard to hide losses. This often simply adds intolerably long packet delays.”

“We believe that the best way to approach this question is to take the design of specific algorithmic mechanisms out of the hands of human designers (no matter how sophisticated!), and make the end-to-end algorithm be a function of the desired overall behaviour,” they continue.

Describing TCP behaviour in terms of game theory, the MIT researchers write that the best thing any endpoint can do with a packet, at any given moment, is to send it – and if every endpoint simply hands its packet to the network, the network collapses into congestion.

Remy is designed to work on a subnetwork basis – that is, all endpoints in a subnet are running Remy. Hence, for example, on a home network, Remy's aim would be to limit local congestion by having the hosts respond in the same way to that congestion.

To do this, Remy expresses the sender's state as a function of three signals: the interarrival time of acknowledgements from the far end (smoothed as an exponentially weighted moving average, EWMA); the time between the sender timestamps echoed in those acks (also smoothed as an EWMA); and the ratio between the most recent packet RTT and the minimum RTT seen in a session.
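To make those three signals concrete, here is a minimal Python sketch of how a sender might track them. It is an illustration only, not Remy's code: the class name `SenderState`, the `on_ack` method, and the EWMA gain `alpha` are all assumptions made for this example, and it assumes every measured RTT is positive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SenderState:
    """Hypothetical tracker for the three congestion signals (not Remy's API)."""

    alpha: float = 0.125      # assumed EWMA gain; not taken from the article
    ack_ewma: float = 0.0     # smoothed gap between ack arrivals
    send_ewma: float = 0.0    # smoothed gap between echoed sender timestamps
    rtt_ratio: float = 1.0    # most recent RTT divided by the minimum RTT
    min_rtt: Optional[float] = None
    _last_ack_time: Optional[float] = None
    _last_send_time: Optional[float] = None

    def _ewma(self, old: float, sample: float) -> float:
        # Standard exponentially weighted moving average update.
        return (1.0 - self.alpha) * old + self.alpha * sample

    def on_ack(self, ack_time: float, echoed_send_time: float) -> None:
        """Update all three signals when an acknowledgement arrives."""
        rtt = ack_time - echoed_send_time
        self.min_rtt = rtt if self.min_rtt is None else min(self.min_rtt, rtt)
        self.rtt_ratio = rtt / self.min_rtt

        # The two gaps need a previous ack to compare against.
        if self._last_ack_time is not None:
            self.ack_ewma = self._ewma(self.ack_ewma, ack_time - self._last_ack_time)
            self.send_ewma = self._ewma(
                self.send_ewma, echoed_send_time - self._last_send_time
            )

        self._last_ack_time = ack_time
        self._last_send_time = echoed_send_time
```

A caller would invoke `on_ack` once per acknowledgement and read the three fields afterwards. An `rtt_ratio` well above 1 suggests that queues are building somewhere along the path.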

The system then builds a table of rules for its subnetwork, iteratively adjusting congestion behaviours until it arrives at the best case for the given conditions.
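The rule-table idea can be sketched in a few lines of Python. Everything here is hypothetical: the `Rule` and `Action` types, the two-field action (a window multiplier and an increment), and the greedy `improve` loop are assumptions chosen for illustration. In particular, the caller-supplied `score` function merely stands in for whatever evaluation the real optimiser performs.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple

Signals = Tuple[float, float, float]  # (ack_ewma, send_ewma, rtt_ratio)


@dataclass(frozen=True)
class Action:
    """Hypothetical action: how to adjust the congestion window."""
    window_multiple: float
    window_increment: float


@dataclass(frozen=True)
class Rule:
    """Maps a box-shaped region of signal space to an action."""
    lower: Signals  # inclusive bounds
    upper: Signals  # exclusive bounds
    action: Action

    def matches(self, signals: Signals) -> bool:
        return all(lo <= x < hi for lo, x, hi in zip(self.lower, signals, self.upper))


def lookup(rules: List[Rule], signals: Signals) -> Action:
    """Return the action of the first rule whose region contains the signals."""
    for rule in rules:
        if rule.matches(signals):
            return rule.action
    raise LookupError("no rule covers these signals")


def next_window(action: Action, cwnd: float) -> float:
    """Apply an action to the current window, never dropping below one packet."""
    return max(1.0, action.window_multiple * cwnd + action.window_increment)


def improve(action: Action, score: Callable[[Action], float],
            step: float = 0.1, rounds: int = 50) -> Action:
    """Greedy hill-climb: nudge one field at a time while the score improves."""
    best, best_score = action, score(action)
    for _ in range(rounds):
        candidates = [
            replace(best, window_multiple=best.window_multiple + d)
            for d in (-step, step)
        ] + [
            replace(best, window_increment=best.window_increment + d)
            for d in (-step, step)
        ]
        top = max(candidates, key=score)
        if score(top) <= best_score:
            break  # no neighbour is better, so stop at this local optimum
        best, best_score = top, score(top)
    return best
```

In this toy version a table might hold one rule that grows the window while `rtt_ratio` stays below 2 and another that halves it otherwise. A fuller design would also have to decide how to split regions and how to score a whole table at once.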

They've released the code for Remy on GitHub. ®