There are some real problems in DNS, related to the general absence of Source Address Validation (SAV) on many networks connected to the Internet. The core of the Internet is aware of destinations but blind to sources. If an attacker on ISP A wants to forge the source IP address of someone at University B when transmitting a packet toward Company C, that packet is likely to be delivered complete and intact, including its forged IP source address. Many otherwise sensible people spend a lot of time and airline miles trying to improve this situation; I want to shout out to Paul Ferguson and Daniel Senie for their excellent work on BCP38 about 15 years ago. My own modest contribution is SAC004, a pointy-haired-boss (PHB) description of the problem, which just had its ten-year anniversary, during which we all agreed, rather morbidly, that the problem had gotten nothing but worse in the ten years since publication.

The problems created for the Domain Name System (DNS) by the general lack of SAV are simply hellish. DNS is both fragile and dangerous because of the general lack of SAV — by which I mean that DNS can be injured easily and that it is easy to use DNS as a weapon to injure others. The main problems related to the general lack of SAV are:

1. Indirect packet-bombing. Since anyone can impersonate anyone else at the Internet's most fundamental packet-layer level, a DNS server who receives a stream of requests from some impersonated victim will answer those requests, therefore transmitting to the victim some response traffic they did not solicit and can neither avoid nor shut off. Given recent advances in DNS such as DNSSEC, responses are a lot larger than they used to be, so the responses received by the victim can currently be up to 70 times the size of the requests originally forged by the attacker. The ratio between the size of the forged request and the unsolicited response is called the "amplification factor".

   Solution: DNS Response Rate Limiting (RRL), created in 2012 by Vernon Schryver and Paul Vixie, allows a server to loosely keep track of repeated queries and avoid answering query flows that would not have come from a legitimate client. The decision criterion is that a legitimate client would have stopped asking when they got the answer we sent earlier. We still need universal SAV, but RRL gives us some breathing space.

2. Datagram related cache poisoning. Since anyone can impersonate anyone else at the Internet's most fundamental packet-layer level, a DNS questioner who receives a stream of almost-matching answers to one of their outstanding questions will quietly and efficiently sort through that stream, waiting for a response having the correct 16-bit transaction ID (TXID) as well as a correct 16-bit UDP source port. An attacker adds their responses to this stream, making guesses at the 16-bit TXID and 16-bit UDP port. In less time than we'd like, the law of averages gives the attacker their "in", and their poisonously wrong answer will have the right pair of 16-bit identifying marks. The questioner in this scenario is an ISP or university or company name server answering for a large local population, so the poisonously wrong answer will be cached and shared with the local population.

   Solution: UDP Source Port Randomization (SPR), invented by Dan Bernstein and brought to bear on this problem by Dan Kaminsky. SPR expands the size of the random target from 1-in-65,000 to 1-in-7,000,000, and lengthens the average successful attack from "minutes at 100Mbit/sec" to "days at 100Mbit/sec". Note that SPR is only a band-aid; the real fix for all forms of cache poisoning is DNSSEC.

3. Fragmentation related cache poisoning. Modern DNS depends on large UDP datagrams that will not necessarily fit into a single packet on the wire. The Internet handles this by splitting the UDP datagram into multiple IP packets called "fragments", where the first fragment has the front of the datagram, the middle fragments if any have more of the datagram, and the last fragment has the end of the datagram. The security problem created by this is that the 16-bit DNS transaction ID (TXID) and 16-bit UDP source port are only present in the first fragment. The tie that binds all these fragments together so that they can be reassembled at the destination is a 16-bit IP ID. Our earlier experiences tell us that because SAV is not widely deployed, attackers can forge the IP source address of any packet, including faked middle and final fragments. And they only need to guess a 16-bit number in this case, which means their target is 1-in-65000 again, just like DNS itself was back before we deployed SPR.

   Solution: stop using fragmentation for DNS, unless the message is signed with DNSSEC or Transaction Signatures (TSIG). This is a nice neat solution, since the main reason we need large UDP datagrams in modern DNS is because of DNSSEC. Fragments are not a concern when data is crypto-authentically signed by DNSSEC or TSIG, because forging a middle or final fragment will result in a reassembled datagram with bad signatures.
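
The arithmetic behind the "amplification factor" and the 16-bit guessing targets described in the list above can be sketched in a few lines of Python. This is only an illustration: the 60-byte request and 4200-byte response are hypothetical sizes picked to reproduce the 70x worst case, and the function names are invented for this sketch.

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """Ratio of the unsolicited response size to the forged request size."""
    if request_bytes <= 0:
        raise ValueError("request_bytes must be positive")
    return response_bytes / request_bytes


def reflected_bits_per_second(attacker_bps: float, factor: float) -> float:
    """Traffic arriving at the victim if every forged request is answered."""
    return attacker_bps * factor


def guess_space(identifier_bits: int) -> int:
    """Number of equally likely values for a random identifier of this width."""
    return 2 ** identifier_bits


# Hypothetical sizes, chosen only to match the 70x figure quoted in the text.
factor = amplification_factor(request_bytes=60, response_bytes=4200)
victim_bps = reflected_bits_per_second(attacker_bps=1_000_000, factor=factor)

print(f"amplification factor: {factor:.0f}x")
print(f"1 Mbit/sec of forged queries becomes {victim_bps / 1e6:.0f} Mbit/sec at the victim")
print(f"values to guess for a 16-bit TXID or IP ID: {guess_space(16)}")
```

The same 16-bit target applies to the TXID before SPR and to the IP ID in the fragmentation case, which is why the third problem puts the attacker back at roughly 1-in-65,000.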

DNS-based packet-bombing attacks are all the rage. They are often directed at web servers or IRC servers or gaming servers, but just as often at other DNS servers. Some of the attacks are motivated by anger or amusement, and others by ransom or protection demands. "Nice online gambling establishment you've got there… it'd be a real shame if something bad happened to it."

SAV, where deployed, stops DNS packet-bombing attacks, and also stops off-path DNS poisoning attacks. However, SAV has to be deployed on the attacker's network, and is therefore not a viable defense. Things could improve if a lot of operators all over the Internet decided to make common cause with each other and all do the right and necessary thing by deploying SAV locally even though the beneficiaries would only be users of other networks, and no one could catch them if they cheated. This outcome seems unlikely but several of us will keep on trying to get the word out about it.

DNSSEC, where deployed, stops all known DNS poisoning attacks, on-path or off-path. However, it has to be deployed both by a domain owner and by name server operators. There are tens of millions of old DNSSEC-incapable name servers out there, all outside the domain owner's influence. We are at the time of this writing sixteen years into the DNSSEC effort, and universal deployment does not seem near, but several of us will keep trying to get the word out about it.

These are real problems that your colleagues in the Internet engineering and operations worlds are wrestling with every week or so, if not every day. There are however weaker problems, less real, often spoken of, sometimes reported, but also disputed. A widely circulated vulnerability report in recent weeks has occasioned this article, and before we get to less well known wrong-headed beliefs about what's wrong with the DNS and how to fix them, let's get right at the headlines.

RRL Slip Frames

It's been observed, correctly, that DNS Response Rate Limiting (RRL) interacts poorly with UDP Source Port Randomization (SPR) as a fix for Kaminsky-style DNS cache poisoning. What happens is that each response which is deliberately dropped by RRL lengthens the time window during which an attacker can flood a questioner with possibly-matching answers to an outstanding question. Simply put, under RRL, the lifetime of a question can be ~30 seconds, whereas without RRL, the lifetime is ~30 milliseconds. Because of the law of averages, this ~1000x increase in the length of the time window yields a corresponding improvement in attack effectiveness. The proposed solution, with which RRL's creators do not agree, is to make the default RRL "slip" value be 1 rather than 2. In this configuration there would be no dropped responses, only Slip frames (TC=1 responses) urging a questioner (if there is one) to retry with TCP.
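
To make the "law of averages" argument concrete, here is a rough Python sketch of how the chance of a matching forged answer grows with the length of the window. It assumes a simplified model that is not taken from any implementation: the attacker guesses uniformly at random, each guess is independent, and the guess space is the 1-in-7,000,000 figure quoted earlier. The guess rate is a hypothetical number chosen for illustration.

```python
import math

SPR_GUESS_SPACE = 7_000_000      # 1-in-7,000,000 target, as quoted for SPR
WINDOW_WITHOUT_RRL = 0.030       # seconds a question stays outstanding (~30 ms)
WINDOW_WITH_RRL_DROP = 30.0      # seconds when the answer was dropped (~30 s)


def success_probability(guess_rate_pps: float, window_seconds: float,
                        guess_space: int) -> float:
    """Chance that at least one forged answer matches within one window.

    Computes 1 - (1 - 1/guess_space) ** guesses, using log1p/expm1 so that
    very small probabilities are not lost to rounding.
    """
    guesses = guess_rate_pps * window_seconds
    return -math.expm1(guesses * math.log1p(-1.0 / guess_space))


# Hypothetical attacker rate, in forged answers per second.
rate = 1_000
p_short = success_probability(rate, WINDOW_WITHOUT_RRL, SPR_GUESS_SPACE)
p_long = success_probability(rate, WINDOW_WITH_RRL_DROP, SPR_GUESS_SPACE)

print(f"per-window success without RRL drop: {p_short:.3e}")
print(f"per-window success with RRL drop:    {p_long:.3e}")
print(f"improvement for the attacker:        {p_long / p_short:.0f}x")
```

Under this model the improvement tracks the ~1000x window ratio closely as long as the per-window probability stays small; at much higher guess rates the long window begins to saturate and the ratio falls below 1000.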

This proposal causes more problems than it solves, and is in fact unnecessary, because the problem described is quite livable. Here are the details.

First let's consider the impact on the packet-bombing victim if a reflecting name server uses RRL with "slip=1" such that there are no dropped responses, only Slip frames. This victim will see a drop in bits per second but no drop in packets per second. This is because Slip frames are smaller than real answers. In fact, Slip frames are identical in size to the question, and so a "slip=1" proponent might validly call this configuration attenuative: the attack heard by the victim will have fewer bits in it thanks to RRL. We are however directly aware of a vast number of routers, switches, servers, name servers, firewalls, and other on-path devices whose principal bottleneck is packets, not bits. That is, these devices might be able to receive or forward five hundred megabits per second (500 Mbit/sec) of large packets but only fifty megabits per second (50 Mbit/sec) of small packets. This is weak engineering on their part, but we don't get to judge the manufacturers or the operators of these weak devices; we must take them into account when planning our defense. The creators of DNS RRL did take account of these common limitations, and our conclusion was that RRL must be attenuative in packets per second, not just in bits per second, in order to serve its intended purpose. Real operational experience has shown that "slip=2" makes a server unattractive as a denial-of-service reflector, whereas "slip=1" does not.
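
The difference between attenuating bits and attenuating packets can be shown with a small Python sketch. It is not taken from any name server's source; it assumes only that a "slip" value of N turns every Nth rate-limited response into a Slip frame and drops the rest, and that a Slip frame is the same size as the question. The byte sizes and the function name are hypothetical.

```python
QUERY_BYTES = 60       # hypothetical question size; a Slip frame is the same size
ANSWER_BYTES = 4200    # hypothetical full answer size


def rrl_output(limited_responses: int, slip: int, slip_frame_bytes: int):
    """Packets and bytes actually sent for responses that exceeded the limit.

    slip=0 drops every limited response, slip=1 sends a Slip frame for each
    one, slip=2 sends a Slip frame for every second one, and so on.
    """
    if slip < 0:
        raise ValueError("slip must be zero or positive")
    packets = 0
    for index in range(1, limited_responses + 1):
        if slip > 0 and index % slip == 0:
            packets += 1          # truncated (TC=1) reply goes out
        # otherwise the response is silently dropped
    return packets, packets * slip_frame_bytes


limited = 1000
print(f"no RRL : {limited} packets, {limited * ANSWER_BYTES} bytes")
for slip in (1, 2):
    packets, total_bytes = rrl_output(limited, slip, QUERY_BYTES)
    print(f"slip={slip} : {packets} packets, {total_bytes} bytes")
```

With these illustrative sizes, "slip=1" cuts the bytes sharply but leaves the packet count unchanged, while "slip=2" halves the packet count as well, which is the property that matters to packet-limited devices on the path.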

Second let's examine the real client who would like to make a real query during a packet-bombing attack in which their IP address is being forged in query storms sent to an interesting authority name server who has enabled RRL. This client (i.e., victim) is hearing whatever reflective debris results from the attack, like a large number of unsolicited responses. That debris may be enough to saturate the path from the reflecting server to the victim, but let's assume for a moment that there's enough capacity for the victim to ask a real question and get a real answer even in the midst of this storm. So the victim will not only hear the attack flow but will have its real and legitimate questions swept up in the resulting RRL dragnet. Where "slip=2" as the RRL designers recommend, the victim will see a mixture of dropped responses and Slip frames. When it sees a drop it will retry with UDP, whereas when it sees a Slip it will retry with TCP. This mix of retry types, roughly half UDP and half TCP, is the best case scenario, because the victim has a great chance of acquiring its real answer, after which it will stop asking the attack-similar question. A pure TCP fallback strategy would be less reliable due to the fragility of TCP/DNS, about which more will be said below. Since RRL's goals are to both avoid congestion and preserve content reachability, the default "slip=2" really is far better than "slip=1".

Finally let's look at detectability. It's been observed that an authority server who enables RRL with the default "slip=2" configuration will see its names more vulnerable to Kaminsky-style DNS poisoning attacks. The general increase is from "days of 100 Mbit/sec blast" to "hours of 100 Mbit/sec blast". This increase cannot be simply ignored, but on consideration of the attack surface, we find that a recursive name server that is unmonitored to the extent that a fat 100 Mbit/sec blast can go unnoticed for many hours is a recursive name server that has bigger problems of its own, and creates bigger problems for all of us than increased susceptibility to DNS poisoning. We urge the operators of such recursive servers to either close their server to public access, or to install a firewall between their server and the rest of us, and in either case to please deploy DNSSEC validation. We do not believe that there are a lot of authority server operators lying awake nights right now worrying about the difference between "slip=2" and "slip=1", because they are too busy lying awake nights thinking about the tens of millions of recursive name servers that are either open to the public Internet, or which have not been patched for SPR, or both. The "slip=2" problem, if any, is specific to certain names, and still requires many hours of uninterrupted 100 Mbit/sec blasting from the attacker to the victim in order to have a chance at success. This level of threat is beneath concern for Internet infrastructure operators.

Use of TCP

The designers, implementers, and operators of DNS infrastructure are often exhorted, "why don't we just use TCP?" The attraction of TCP is obvious — it is not susceptible to SAV attacks. A TCP packet whose IP source address is forged has no impact on anybody, since the attacker's inability to hear the victim's response prevents TCP from "starting up" and from consuming any server resources. However, the reasons not to use TCP are just as obvious. DNS uses UDP by default, and TCP as a fallback, and this design element was in no way accidental, and is not subject to change at this late date. Let's explore the reasons.

First there's total transaction time. DNS/UDP is a single round trip protocol: a question goes out, the answer comes back. Even if the answer is fragmented and therefore contains several packets, those packets will be minimum spaced, back to back, thus fitting into a single round trip time (RTT). TCP by comparison is a 3xRTT protocol, requiring a minimum of three round trips to exchange a question and an answer. The SYN goes out, the SYN+ACK comes back, an ACK+question goes out, then an ACK+response comes back, a FIN comes back, a FIN+ACK goes out, and finally an ACK comes back. With even moderate RTTs measured in the 50 millisecond to 100 millisecond range, DNS/TCP has far lower throughput than DNS/UDP simply because of the speed of light and the laws of physics and the number of round trips involved. It's reasonable to expect a small 1U Linux rack-mount server to handle 100 Kq/sec of pure DNS/UDP but only 5 Kq/sec or less when tested with pure DNS/TCP.
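
The round-trip arithmetic can be written out as a short Python sketch. It models only the wire latency of one transaction on one flow, using the one-round-trip and three-round-trip counts given above; it says nothing about server-side processing cost, so it does not by itself reproduce the 100 Kq/sec and 5 Kq/sec figures. The function names are invented for this illustration.

```python
UDP_ROUND_TRIPS = 1   # question out, answer back
TCP_ROUND_TRIPS = 3   # handshake, question and answer, teardown


def transaction_ms(rtt_ms: int, round_trips: int) -> int:
    """Minimum wall-clock time for one question and answer, in milliseconds."""
    return rtt_ms * round_trips


def serial_transactions_per_second(rtt_ms: int, round_trips: int) -> float:
    """Transactions per second if one flow waits for each to finish."""
    return 1000 / transaction_ms(rtt_ms, round_trips)


for rtt_ms in (50, 100):
    udp_ms = transaction_ms(rtt_ms, UDP_ROUND_TRIPS)
    tcp_ms = transaction_ms(rtt_ms, TCP_ROUND_TRIPS)
    udp_rate = serial_transactions_per_second(rtt_ms, UDP_ROUND_TRIPS)
    tcp_rate = serial_transactions_per_second(rtt_ms, TCP_ROUND_TRIPS)
    print(f"RTT {rtt_ms} ms: UDP {udp_ms} ms ({udp_rate:.1f}/sec), "
          f"TCP {tcp_ms} ms ({tcp_rate:.1f}/sec)")
```

At a 100 millisecond RTT a fresh TCP transaction takes at least 300 milliseconds where UDP needs 100, before any of the state-keeping costs on the server are counted.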

It's been argued that the extra round trips of using TCP for DNS can be amortized over many queries: if you leave the TCP sessions open, you can send many queries and receive many answers at a rate closer to one transaction per RTT. This observation is not without merit, but it does not hold up. DNS servers, especially older ones, are typically limited to a few dozen open TCP sessions per server, whereas the number of high volume flows handled by a busy name server is in the thousands or tens of thousands. This is, at least, an orders-of-magnitude problem. But in addition, we have the problem of the DNS specification, which specifies that the initiator will close the TCP session when its work is complete and that a server shall not unilaterally close such sessions, even for resource exhaustion reasons, without first waiting about 30 seconds. This makes such servers vulnerable to trivial TCP exhaustion attacks, where any attacker can at very low cost acquire and hold all of a server's TCP resources. Since many attackers have denial of service as one of their goals, we as defenders know that any strategy which urges more transactions toward TCP dependency is itself too fragile.

A new protocol could be designed that would not have these problems, or indeed a change could be published to the DNS specification which gave us a more robust session handling mechanism for DNS/TCP. If not for the tens of millions of existing servers who will behave in the old way, and the estimated two decades before a change like this could reach critical mass, the idea of fixing DNS/TCP would have some merit.

Query Type ANY

Many recent packet-bombing attacks have used query type ANY in order to incite a reflecting server to respond with a large (amplified) answer. Query type ANY was designed for diagnostic purposes and has no real operational use, and so, some inventive server operators have modified their name servers to drop queries of type ANY. This is short sighted in two ways. First, it is not necessary to use query type ANY to get a server to send a large answer — many other query types such as TXT or even MX are capable of generating large answers that provide excellent amplification. In addition, the advent of DNSSEC means that all answers are far larger than they used to be, especially for negative answers which are DNSSEC's largest kind.

Even more importantly, security is a matter of economics. Attackers and defenders are trying to drive their opponent's costs up while driving their own costs down — that's the game. This move, where a server operator modifies their name server to drop queries of type ANY, is exactly wrong. Any attacker who is in the least way inconvenienced by this change can simply change their attack to use a different query type, at which point the defense has no easy next step. Secure defense must be designed according to the attacker's resulting alternatives and the defender's resulting costs for each of those.

State Tension

There is a necessary tension between performance and safety. DNS is an incredibly large and busy global system involving hundreds of millions of agents and many billions of transactions every day. This system scales because it uses UDP, a stateless transport protocol that requires only one round trip per transaction and no inter-transaction state. As a result of DNS's primary dependency on UDP, and due to the lack of universal deployment of SAV, DNS's extreme performance comes along with extreme fragility and extreme danger. If we want less fragility and less danger then we're going to have to add state somewhere. RRL adds opportunistic light weight state to servers, allowing them to avoid serving as reflecting amplifiers in today's common packet-bombing attacks. TCP adds required heavy weight state, at a cost far higher than we can accept given DNS's massive size and transaction load. Other forms of state are possible, and Donald Eastlake proposed an opportunistic medium weight method in 2007 that's worth another look: DNS Cookies.

In Eastlake's DNS Cookie proposal, a requestor can include a large random number in the clear text of a request, and a responder can echo this back and append its own large random number in the clear text of a response. There is no privacy or secrecy offered. Once each side has proved to the other that they are adding these random numbers to their messages, each other-side can opportunistically drop any message lacking the correct random number for that endpoint. This is problematic due to NAT, where the nature of an endpoint is no longer simply "the IP address you and they both think they are using", but that's a detail. There are other details which also need work, but what's plain by now is that the conclusion reached in 2007 was wrong. The IETF DNS Extensions Working Group (DNSEXT) determined in 2007 that this proposal was too complex for its use case. What we know now that we did not know then is that the use case is every DNS server and every DNS transaction. Let's reconsider, noting that the roll-out can be incremental — there's no flag day and no fork lift.
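
The exchange described above can be sketched as a toy state machine in Python. This is an illustration of the idea only, not Eastlake's wire format or any implementation: the class name, the dictionary message layout, the 8-byte cookie length, and the use of plain strings as peer addresses are all assumptions made for this sketch, and NAT is ignored entirely.

```python
import hmac
import secrets


class CookieEndpoint:
    """Tracks, per peer address, the random numbers exchanged in the clear."""

    def __init__(self):
        self.my_cookie = {}     # peer -> random number we send to that peer
        self.peer_cookie = {}   # peer -> random number that peer sent to us
        self.proven = set()     # peers that have echoed our number correctly

    def outgoing(self, peer):
        """Cookies to attach to a message: ours, plus an echo of theirs."""
        if peer not in self.my_cookie:
            self.my_cookie[peer] = secrets.token_bytes(8)
        return {"sender": self.my_cookie[peer],
                "echo": self.peer_cookie.get(peer)}

    def accept(self, peer, cookies):
        """Return True to process a message, False to drop it."""
        ours = self.my_cookie.get(peer)
        echo = cookies.get("echo")
        echo_ok = (ours is not None and echo is not None
                   and hmac.compare_digest(echo, ours))
        if peer in self.proven and not echo_ok:
            return False        # peer is known to use cookies; this one is wrong
        if echo_ok:
            self.proven.add(peer)
        sender = cookies.get("sender")
        if sender is not None and (echo_ok or peer not in self.peer_cookie):
            self.peer_cookie[peer] = sender
        return True             # opportunistic: unproven peers are still served


client, server = CookieEndpoint(), CookieEndpoint()
first_query = client.outgoing("server")
print("first query accepted:", server.accept("client", first_query))
print("response accepted:   ", client.accept("server", server.outgoing("client")))
print("second query accepted:", server.accept("client", client.outgoing("server")))
forged = {"sender": secrets.token_bytes(8), "echo": secrets.token_bytes(8)}
print("forged query accepted:", server.accept("client", forged))
```

Because an endpoint that has never seen a correct echo keeps accepting messages without one, peers that do not implement the scheme continue to work, which is what allows an incremental roll-out.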

Conclusion

Secure design isn't just preventing the problems you can think of or that you're having. It also doesn't mean preventing problems that you've heard anecdotally that other people have thought of or might be having. Security is about economics. One design is more secure than another if its risks and risk related costs are lower, which means some thought has to be given to the costs an attacker would have. Changing a design for security reasons requires careful cost analysis of the defender's alternatives, and careful benefit analysis of what the attacker's alternatives would be. Goodness in defense comes from reducing the defender's costs and possibly raising the attacker's costs.

More importantly, the Internet is 30 years old and yet just beginning. Our designs should take the past into account and should learn from present day experiences, but we must look primarily to the future. The largest part of the cost:benefit curve, and the greatest area under same, is in the future. When we push for solutions to today's problems we run the risk of re-fighting the previous war and also the risk of driving the defender's costs up for no good reason because we didn't change the attacker's costs enough.

A DNS authority server operator whose intent in deploying RRL is to make their servers less attractive as reflectors for packet-bombing attacks will want the default Slip value of "2" since this attenuates both bits and packets. Any change in the risk of that authority's names being poisoned as a result of RRL will be so small as to be of academic interest only.