Even before the leaks by former NSA sysadmin Edward Snowden, rumours had circulated for years that the agency could decrypt a significant fraction of encrypted internet traffic.

Now security researchers, who first published a paper on the subject in May, have come forward with a detailed and credible theory on the technical foundations of this code-breaking capability. They presented a talk last week that better explained how the theory fits with the Snowden leaks.

Three years ago, James Bamford published an article quoting anonymous former NSA officials stating that the agency had achieved a “computing breakthrough” that gave them “the ability to crack current public encryption.” The Edward Snowden documents revealed that the NSA had the ability to intercept and decrypt VPN traffic. The on-demand decryption of some HTTPS and SSH connections was also possible because of unspecified but groundbreaking cryptanalysis capabilities, according to the Snowden leaks.

These reports have fuelled speculation in the technical community about backdoors or broken algorithms. Weaknesses in the ageing (but still widely used) RC4 cipher and even flaws in AES have both been suggested as possible explanations.

Earlier this week, the 13-member research team presented a paper at the ACM Computer and Communications Security conference billed as an answer to this technical mystery.

Diffie-Hellman is a cornerstone of modern cryptography used for VPNs, HTTPS websites, email, and many other protocols. Bad implementation choices combined with advances in number theory mean real-world users of Diffie-Hellman are likely vulnerable to state-level attackers, the researchers warned back in May.

The root cause of the cryptographic weakness is that many applications use a standardised or hard-coded prime. This means an adversary can perform a single enormous computation to “crack” a particular prime, after which breaking any individual connection that uses that prime becomes comparatively cheap.
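
To see why a shared prime matters, here is a toy sketch in Python. It is not the researchers' method: the real attack relies on number field sieve precomputation against 1024-bit primes, whereas this illustration swaps in a brute-force lookup table over a 17-bit prime. The prime, the generator and every function name are assumptions chosen for the example.

```python
import secrets

# Toy parameters (assumptions for illustration only; real groups are 1024+ bits).
P = 65537  # a small prime shared by every "connection"
G = 3      # the shared generator


def precompute_dlog_table(g, p):
    """One-time, expensive step: map every power of g mod p to its exponent."""
    table = {}
    value = 1
    for exponent in range(p - 1):
        if value in table:  # the powers of g have started to repeat
            break
        table[value] = exponent
        value = value * g % p
    return table


def handshake(g, p):
    """Simulate one Diffie-Hellman exchange; return both public values and the secret."""
    a = secrets.randbelow(p - 2) + 1  # one side's private exponent
    b = secrets.randbelow(p - 2) + 1  # the other side's private exponent
    public_a = pow(g, a, p)
    public_b = pow(g, b, p)
    shared = pow(public_b, a, p)      # equals pow(public_a, b, p)
    return public_a, public_b, shared


def eavesdrop(public_a, public_b, p, table):
    """Cheap per-connection step: recover the secret from public values alone."""
    recovered_exponent = table[public_a]
    return pow(public_b, recovered_exponent, p)


if __name__ == "__main__":
    table = precompute_dlog_table(G, P)  # paid for once
    for _ in range(5):                   # reused against every connection
        public_a, public_b, shared = handshake(G, P)
        print(eavesdrop(public_a, public_b, P, table) == shared)
```

The table is built once, yet it unlocks every handshake that reuses the same prime, which is the economics the researchers describe.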

For the most common strength of Diffie-Hellman (1024 bits), the researchers estimated it would cost a few hundred million dollars to build a machine, based on special purpose hardware, that would be able to crack one Diffie-Hellman prime every year.

This is a technical feat on a scale (relative to the state of computing at the time) not seen since the Enigma cryptanalysis during World War II. Yet since only a handful of primes are so widely reused, the payoff, in terms of connections that would then be open to decryption, would be enormous for the likes of the NSA.

The researchers estimate that breaking a single, common 1024-bit prime would allow the NSA to passively decrypt connections to two-thirds of VPNs and a quarter of all SSH servers globally. Breaking a second 1024-bit prime would allow passive eavesdropping on connections to nearly 20 per cent of the top million HTTPS websites. In other words, a one-time colossal investment in heavy-duty computation would make it possible to eavesdrop on trillions of encrypted connections.

Naturally the NSA could readily afford such an investment, especially given the huge benefits. A 2013 “black budget” request (pdf, via the EFF) states that NSA has prioritised “investing in groundbreaking cryptanalytic capabilities to defeat adversarial cryptography and exploit internet traffic.” The spy agency’s budget is estimated to be around $10bn a year, with over $1bn dedicated to computer network exploitation, and several subprograms in the hundreds of millions a year.

A blog post by two of the researchers (Alex Halderman, an associate professor of computer science and engineering at the University of Michigan, and Nadia Heninger, an assistant professor of computer and information science at the University of Pennsylvania) concludes that cracking widely shared Diffie-Hellman primes fits what’s known about the NSA’s mystery bulk decryption capabilities better than any previously advanced theory.

“Based on the evidence we have, we can’t prove for certain that NSA is doing this. However, our proposed Diffie-Hellman break fits the known technical details about their large-scale decryption capabilities better than any competing explanation. For instance, the Snowden documents show that NSA’s VPN decryption infrastructure involves intercepting encrypted connections and passing certain data to supercomputers, which return the key. The design of the system goes to great lengths to collect particular data that would be necessary for an attack on Diffie-Hellman but not for alternative explanations, like a break in AES or other symmetric crypto. While the documents make it clear that NSA uses other attack techniques, like software and hardware “implants,” to break crypto on specific targets, these don’t explain the ability to passively eavesdrop on VPN traffic at a large scale.”

Faulty application of Diffie-Hellman is widespread in many standards and implementations. Security weaknesses are built into deployed systems that are unlikely to be replaced for years, even given the heightened concern prompted by the latest research. This means that cracking these systems will soon be within the capabilities of other top-tier intelligence agencies, assuming they aren’t already attacking them.

The possibility of multiple governments attempting attacks illustrates the conflict between the NSA’s two prime missions of gathering intelligence and defending US computer security. If the researchers are correct then the NSA has been vigorously exploiting weak Diffie-Hellman, while doing little or nothing to help fix the problem. On the defensive side, NSA has recommended that implementers should transition to elliptic curve cryptography, which isn’t known to suffer from this loophole, but such recommendations tend to go unheeded without explicit justifications or demonstrations.

“This state of affairs puts everyone’s security at risk. Vulnerability on this scale is indiscriminate — it impacts everybody’s security, including American citizens and companies — but we hope that a clearer technical understanding of the cryptanalytic machinery behind government surveillance will be an important step towards better security for everyone,” the researchers conclude.

More details of the research can be found in the paper, entitled Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice (pdf), which received the best paper award at ACM CCS.

The paper first appeared in May; what’s new is that it has now been publicly presented, together with a lot more detail on how the codebreaking might work inside the NSA surveillance architecture and how it fits with NSA rhetoric about its crypto-breaking capabilities in some of the secret documents leaked by Snowden.

“Now that this is out, I'm sure there are a lot of really upset people inside the NSA,” said crypto guru Bruce Schneier in a blog post about the research.

Related research demonstrated how servers that support 512-bit “export-grade” Diffie-Hellman can be forced to downgrade a connection to that weaker level by exploiting the so-called Logjam vulnerability.
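
One defensive response to this kind of downgrade is for a client to refuse undersized Diffie-Hellman groups outright. What follows is a minimal sketch, assuming a hypothetical `accept_dh_prime` helper and a 2048-bit floor picked purely for illustration. It checks only the size of the offered prime, not its primality or provenance, and it is not taken from any real TLS library.

```python
# Assumed policy floor for illustration; not a value taken from the article.
MIN_DH_BITS = 2048


def accept_dh_prime(prime: int, min_bits: int = MIN_DH_BITS) -> bool:
    """Return True only if the offered prime is at least min_bits long.

    Hypothetical helper: a real client would also validate the group itself.
    """
    return prime.bit_length() >= min_bits


if __name__ == "__main__":
    # Stand-in integers of the right bit lengths, not real primes.
    export_grade = (1 << 511) + 1   # 512 bits, as in a Logjam downgrade
    common_size = (1 << 1023) + 1   # 1024 bits, the widely deployed size
    larger_size = (1 << 2047) + 1   # 2048 bits

    for label, value in [("512-bit", export_grade),
                         ("1024-bit", common_size),
                         ("2048-bit", larger_size)]:
        print(label, "accepted" if accept_dh_prime(value) else "rejected")
```

Under this assumed policy a client would abort the handshake rather than quietly negotiate a group that is cheap to break.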

Additional commentary on the NSA and Weak-DH research can be found in blog posts by Nicholas Weaver and Rob Graham.

Graham calculates that cracking 1024-bit DH is the computational equivalent of 2.5 hours' worth of global Bitcoin mining power.

Steven Bellovin’s blog post, I'm Shocked, Shocked to Find There's Cryptanalysis Going On Here (Your plaintext, sir.), adds a much wider perspective on the break. “People seem shocked about the problem and appalled that the NSA would actually exploit it. Neither reaction is right,” Bellovin argues. ®