The CHOKe packet scheduler

LWN.net needs you! Without subscribers, LWN would simply not exist. Please consider signing up for a subscription and helping to keep LWN publishing

A packet on the network typically passes through several machines on the way from its source to its destination. One of those machines (or, more correctly, one of the outbound links from one of those machines) will be the limiting factor on how many packets can traverse that path in a given period of time. If a system tries to send too many packets through the limiting link, the packet queue on the router attached to that link will grow. A growing queue affects other users of that router and will eventually hit its limits, causing packet loss.

The TCP protocol has, for many years, included congestion control algorithms which attempt to determine the carrying capacity of a path and to avoid exceeding that capacity. These algorithms have successfully prevented a repeat of the meltdowns which plagued the early Internet. But congestion control isn't working as well as it should, for a few reasons. Some TCP implementations are more dutiful than others when it comes to congestion control. An increasing amount of traffic on the net uses other protocols (UDP in particular) which do not have congestion control built into them. Excessive queue sizes in routers ("bufferbloat") can also disguise congestion problems until it is too late. All of these problems are motivating a search for better ways of controlling congestion.

The key signal for congestion control on the net is dropped packets; TCP will continue to ramp up its transmit rate until the occasional lost packet makes it clear that the limit has been hit. So the way for a router in the middle of the network to tell a specific sender that it's transmitting too much data is to drop some of that data on the floor. The idea is simple in concept, but it can be harder in practice for a simple reason: network routers can deal with many thousands of packets every second; they cannot afford to spend significant amounts of time on any one of those packets.

An obvious way for a router to schedule packets would be to maintain a queue for every flow (source/destination pair with port numbers) through the system. Packets could then be dequeued and transmitted with absolute fairness, and, any time the queue for a flow gets too long, packets could be dropped from that queue. Implementing this algorithm would require some complex data structures and a fair amount of processor time, though, so it is not an option for a router which handles a significant amount of traffic.

An alternative is the CHOKe algorithm [PS]; CHOKe stands either for "CHOose and Kill" or "CHOose and Keep," depending on one's attitude toward the problem. Stephen Hemminger has recently posted a CHOKe implementation for Linux, so this seems like a good time to look at how this algorithm works.

CHOKe is intended for points where multiple flows come together - routers and bridges, primarily. The idea behind CHOKe is to keep the length of transmit queues under control and to penalize flows with excessive traffic while avoiding the need to maintain any sort of per-flow state. To that end, the packet queuing algorithm works essentially like the following. When a packet arrives for a given outbound link, the CHOKe code will:

Calculate a moving average of the length of the queue. The algorithm includes a parameter for the period over which the average is calculated; a longer period will allow longer load spikes before the algorithm starts CHOKing traffic.

If the average queue length is below a minimum watermark, there is no problem with congestion, so the packet will simply be queued and the job is done.

If the queue length is above the minimum, the CHOKe algorithm picks a random packet from the queue. If that packet belongs to the same flow as the packet under consideration, both packets will be dropped. When a randomly-picked packet comes from the same flow, chances are good that packets from that flow occupy a substantial amount of queue space, so that flow is likely to be a source of the problem.

packets will be dropped. When a randomly-picked packet comes from the same flow, chances are good that packets from that flow occupy a substantial amount of queue space, so that flow is likely to be a source of the problem. If the packets are from different flows, but the queue length is above an administrator-set maximum, then the new packet (only) will be dropped.

In the final case, the algorithm calculates a probability that the packet will be dropped, even though the maximum queue length has not been reached. The probability grows as the queue length increases, but, by default, it remains low - about 2% at the maximum. Thus, for mid-length queues, the algorithm will occasionally send a signal to a transmitter that it should back off a bit, but most packets will be queued normally.

The key feature of CHOKe - the one which distinguishes it from RED (from which it is derived) - is the check against a random packet in the queue. That is a heuristic mechanism for identifying problematic flows without actually having to track what each flow is doing. Experience, as reported in the CHOKe paper, suggest that it works pretty well.

An important factor in successful use of CHOKe in the real world will be careful selection of the controlling parameters: the minimum and maximum queue lengths, average period, and drop probability. In particular, there is mounting evidence (thanks to the efforts by Jim Gettys) that overly long queues lead to all kinds of pathological network behavior, and could even threaten a net collapse at some point. Use of algorithms like CHOKe, combined with reasonably-sized queues, could help keep the Internet working well into the future.

