Update: I’ve added a link to a page at Royal Holloway describing the new attack.

Listen, if you’re using RC4 as your primary ciphersuite in SSL/TLS, now would be a great time to stop. Ok, thanks, are we all on the same page?

No?

I guess we need to talk about this a bit more. You see, these slides have been making the rounds since this morning. Unfortunately, they contain a long presentation aimed at cryptographers, and nobody much seems to understand the real result that’s buried around slide 306 (!). I’d like to help.

Here’s the short summary:

According to AlFardan, Bernstein, Paterson, Poettering and Schuldt (a team from Royal Holloway, Eindhoven and UIC) the RC4 ciphersuite used in SSL/TLS is broken. If you choose to use it — as do a ridiculous number of major sites, including Google — then it may be possible for a dedicated attacker to recover your authentication cookies. The current attack is just on the edge of feasibility, and could probably be improved for specific applications. This is bad and needs to change soon.

In the rest of this post I’ll cover the details in the usual ‘fun’ question-and-answer format I save for these kinds of attacks. What is RC4 and why should I care? RC4 is a fast stream cipher invented in 1987 by Ron Rivest. If you like details, you can see this old post of mine for a longwinded discussion of RC4’s history and flaws. Or you can take my word: RC4 is old and crummy. Despite its age and crumminess, RC4 has become an extremely popular ciphersuite for SSL/TLS connections — to the point where it accounts for something absurd like half of all TLS traffic. There are essentially two reasons for this:

RC4 does not require padding or IVs, which means it’s immune to recent TLS attacks like BEAST and Lucky13. Many admins have recommended it as the solution to these threats. RC4 is pretty fast. Faster encryption means less computation and therefore lower hardware requirements for big service providers like Google.

So what’s wrong with RC4?

Like all stream ciphers, RC4 takes a short (e.g., 128-bit) key and stretches it into a long string of pseudo-random bytes. These bytes are XORed with the message you want to encrypt, resulting in what should be a pretty opaque (and random-looking) ciphertext.

biases. A few of these biases have been known for years, but weren’t considered useful. However, many more. The problem with RC4 is that the above statement is not quite true. The bytes come out of the RC4 aren’t quite random looking — they have small. A few of these biases have been known for years, but weren’t considered useful. However, recent academic work has uncovered many small but significant additional biases running throughout the first 256 bytes of RC4 output. This new work has systematically identifiedmore.

At first blush this doesn’t seem like a big deal. Cryptographically, however, it’s a disaster. If I encrypt the same message (plaintext) with many different RC4 keys, then I should get a bunch of totally random looking ciphertexts. But these tiny biases mean that they won’t be random — a statistical analysis of different positions of the ciphertext will show that some values occur more often.

By getting many different encryptions of the same message — under different keys — I can tally up these small deviations from random and thus figure out what was encrypted.

If you like analogies, think of it like peeking at a picture with your hands over your eyes. By peeking through your fingers you might only see a tiny sliver of the scene you’re looking at. But by repeating the observation many times, you may be able to combining those slivers and figure out what you’re looking at.

Why would someone encrypt the same message many times?

The problem here (as usual) is your browser.

You see, there are certain common elements that your browser tends to send at the beginning of every HTTP(S) connection. One of these values is a cookie — typically a fixed string that identifies you to a website. These cookies are what let you log into Gmail without typing your password every time.

should be safe. After all, they’ll always be sent over an encrypted connection to the website.Unfortunately, if your connection is encrypted using RC4 (as is the case with Gmail), then each time you make a fresh connection to the Gmail site, you’re sending a new encrypted copy of the same cookie. If the session is renegotiated (i.e., uses a different key) between those connections, then the attacker can build up the list of ciphertexts he needs. If you use HTTPS (which is enforced in many sites by default), then your cookiesbe safe. After all, they’ll always be sent over an encrypted connection to the website.Unfortunately, if your connection is encrypted using RC4 (as is the case with Gmail), then each time you make a fresh connection to the Gmail site, you’re sending a new encrypted copy of the same cookie. If the session is renegotiated (uses a different key) between those connections, then the attacker can build up the list of ciphertexts he needs. To make this happen quickly, an attacker can send you a piece of Javascript that your browser will run — possibly on a non-HTTPS tab. This Javascript can then send many HTTPS requests to Google, ensuring that an eavesdropper will quickly build up thousands (or millions) of requests to analyze.

Probability of recovering a plaintext byte (y axis) at a particular byte position in the RC4-encrypted stream (x axis), assuming 2^24 ciphertexts. Increasing the number of ciphertexts to 2^32 gives almost guaranteed plaintext recovery at all positions. (source)

Millions of ciphertexts? Pah. This is just crypto-wankery. It’s true that the current attack requires an enormous number of TLS connections — in the tens to hundreds of millions — which means that it’s not likely to bother you. Today. On the other hand, this is the naive version of the attack, without any special optimizations. For example, the results given by Bernstein assume no prior plaintext knowledge. But in practice we often know a lot about the structure of an interesting plaintext like a session cookie. For one thing, they’re typically Base64 encoded — reducing the number of possibilities for each byte — and they may have some recognizable structure which can be teased out using advanced techniques. It’s not clear what impact these specific optimizations will have, but it tends to reinforce the old maxim: attacks only get better, they never get worse. And they can get a lot better while you’re arguing about them.

So what do we do about it? Treat this as a wakeup call. Sites (like Google) need to stop using RC4 — before these attacks become practical. History tells us that nation-state attackers are already motivated to penetrate Gmail connections. The longer we stick with RC4, the more chances we’re giving them.

In the short term we have an ugly choice: stick with RC4 and hope for the best, or move back to CBC mode ciphersuites — which many sites have avoided due to the BEAST and Lucky13 attacks.

We could probably cover ourselves a bit by changing the operation of browsers, so as to detect and block Javascript that seems clearly designed to exercise these attacks. We could also put tighter limits on the duration and lifespan of session cookies. In theory we could also drop the first few hundred bytes of each RC4 connection — something that’s a bit of a hack, and difficult to do without breaking the TLS spec.

In the longer term, we need to move towards authenticated encryption modes like the new AEAD TLS ciphersuites, which should put an end to most of these problems. The problem is that we’ll need browsers to support these modes too, and that could take ages.

In summary I once made fun of Threatpost for sounding alarmist about SSL/TLS. After all, it’s not like we really have an alternative. But the truth is that TLS (in its design, deployment and implementation) needs some active help. We’re seeing too many attacks, and they’re far too close to practical. Something needs to give.

We live in a world where NIST is happy to give us a new hash function every few years. Maybe it’s time we put this level of effort and funding into the protocols that use these primitives? They certainly seem to need it.