Background

The Attack

RSA is used in in the process of encrypting data to send across a network. The server transmits its RSA public key to the client as a part of an SSL or TLS handshake. Part of the RSA public key contains the modulus n = p * q, where p and q are two randomly chosen primes of similar size. Primes p and q are kept secret, as knowing these values allows the private key to be calculated.

Ensuring that p and q are selected with sufficient randomness is a crucial component of keeping the public key secure. Factoring a large modulus n to obtain p and q is not feasible under normal circumstances. However, if keys are generated with poor randomness, then it becomes a concern that two public keys will share a factor once enough keys are generated. If two RSA moduli n 1 and n 2 share precisely one prime factor p, then computing the Greatest Common Divisor (GCD) of n 1 and n 2 will reveal the value of p. The GCD computation is significantly easier than straightforward factoring, and can easily be performed in practice. The other factors of n 1 and n 2 can then trivially be found by the simple calculations n 1 ⁄ p and n 2 ⁄ p, respectively, compromising both keys. This GCD computation can be scaled to analyze all pairs of keys in sub-quadratic time in the number of keys [1].

Selecting the prime factors of appropriate size with uniform randomness should prevent two moduli from ever sharing a factor in practice. However, if there is a flaw in the random number generation when choosing primes, a collision is likely in a sufficiently large dataset. Attackers can use this knowledge to collect a large number of RSA public keys and then look for GCDs between their moduli to search for factors shared by any pair.

Previous research teams have performed large scans on the Internet to demonstrate and investigate the implications of this attack. The attack received significant attention in 2012 and researchers were able to break tens of thousands of keys. There was a follow up on the attack in 2016 on a significantly larger data set, with a focus on trends in occurrences of weak keys from various vendors. We focused on three previous publications [1], [2], and [3] in developing our analysis.

Lenstra 2012

The authors of [2] collected approximately 6.2 million digital certificates across the Internet and found that approximately 4.3% of these certificates fully share their RSA modulus with another. The authors clustered the certificates together by modulus to further examine this duplication. Notably, single clusters of sizes 16489, 8366, 6351, and 5055 were found, and 58913 of the clusters have precisely two certificates. Additionally, they found that 12934 of the distinct RSA moduli were able to be factored. The authors found 1200 factors that are shared by more than one modulus. The most widely used individual factors were shared by up to 4627 certificates.

Heninger 2012

In [1], researchers collected 5.8 million unique TLS certificates and 6.2 million unique SSH host keys. Between these two collections, they found 11 million distinct RSA moduli and were able to factor 16,717 distinct public keys. This led to breaking 23,576 (.4%) of their TLS certificates and 1,013 (.02%) of the RSA SSH host keys.

The authors investigated the causes of moduli sharing factors by examining random number generation implementations. A major issue in the flawed random number generation was traced back to a low amount of entropy in the random number pool when the keys were generated. This issue most commonly arises on lightweight devices that do not have much input from which to gain entropy.

This analysis found that only two moduli on publicly trusted certificates could be factored, and both certificates had expired. The large majority of the broken keys came from network devices such as routers, modems, or firewalls.

Hastings 2016

The 2012 results of [1] were reexamined in 2016 in [3] with an emphasis on examining how vendors and end-users responded to the vulnerabilities. The authors examine 81 million distinct RSA keys and were able to factor 313,000 keys (.37%). They also examine trends in the number of weak keys from various companies. Since 2012, a significant number of new devices from companies including Huawei, D-Link, and ADTRAN were found to have vulnerable RSA keys.

Additionally, it was exceedingly rare for end-users to patch their devices to fix vulnerabilities found during the 2012 study. The authors note the “distressing implications” this has for the IoT era, as the number of devices on a network is rapidly increasing.

Rise of the I o T and Cloud Computing

Despite the large number of keys broken by this attack previously, it is unlikely that a key that has been properly generated with a sufficient amount of entropy could be broken with this technique. However, the requirement of sufficient entropy may not always be met when generating keys. In particular, some IoT devices may not have enough entropy available to generate keys without an external source. Major websites can prove vulnerable to this flaw as well, as the public and private keys used with their SSL/TLS certificates are generated by the website owner or administrator, and not by the CA that signs the certificate to make it publicly trusted. This means that the process the site operators used to generate their keys is opaque to the CA, and in general they cannot analyze the submitted public key for poor generation practices.

As growth of the Internet and the IoT has continued, network sizes have grown significantly and Internet-connected devices are appearing in new places. Security is a major concern in this landscape because of the increasing amount of data handled by these devices and the introduction of connected devices in sensitive settings such as operating rooms and vehicles. Compromising any device on these critical networks could results in catastrophic failure if proper precautions are not taken. Similarly, consumers are increasingly relying on publicly available web services to handle sensitive data and perform high-impact operations, such as financial transactions, transmission and storage of valuable intellectual property, and applications for credit or other services. The potential for an attacker to obtain this sensitive information or perform fraudulent operations can significantly impact consumers.

The rise of IoT technologies and cloud-computing resources since the initial publication of this technique makes it more dangerous in today’s landscape for several reasons:

THE IoT IS COMPROMISED MOSTLY OF LIGHTWEIGHT DEVICES.

There is nothing inherently insecure about how such a device would generate an RSA key, but [1] found that lightweight devices are primarily at risk of this attack due to their low entropy states. Entropy in a device is required to prevent the random number generation from being predictable. Researchers were able to find deterministic “random” output when removing entropy. Lightweight IoT devices are particularly prone to being in low entropy states due to the lack of input data they might receive, as well as the challenge of incorporating hardware-based random number generation economically. Keys generated by lightweight IoT devices are therefore at risk of not being sufficiently random, increasing the chance that two keys share a factor and allow the key to be broken. The authors of [1] found that most of the keys that broken were from “low-resource” devices. Only two keys on two certificates were publicly trusted, and both of these certificates had expired.

THE ATTACK BECOMES MORE SUCCESSFUL WHEN MORE PRIME FACTORS ARE ANALYZED.

As the number of discovered certificates grows, the number of certificate pairs to analyze for shared factors increases quadratically. The IoT landscape has drastically increased the number of devices on networks and this trend will only continue. Estimates and projections vary considerably, but Gartner predicts 20 billion devices will be deployed on the Internet within the next year [4].

DEVICES ARE IN NEW AND MORE CRITICAL ENVIRONMENTS.

Compromising an RSA key has much more potential to be catastrophic in 2019. An RSA key being compromised now means more than personal or enterprise data being compromised. Critical real-time environments such as operating rooms, automobiles, industrial control devices, and home security systems now operate using RSA keys. Physical property and lives are therefore now at risk with RSA keys being compromised.

IoT DEVICES ARE MORE DIFFICULT TO PATCH.

Patching many IoT devices across a network can be tedious for a user. There can be many independent devices for a user to consider and manage, often without centralized automation across the devices. In some cases – where original device design did not support patching, or where the device is inaccessible – patching may be impossible. Additionally, due to the long life span of some IoT devices, it is possible that vulnerabilities can be found on live devices that are no longer being actively supported by the manufacturer. This means that a vulnerability to key compromise can be exploitable for much longer, and more difficult to remediate.

COMPUTING AND SCANNING RESOURCES ARE MORE READILY ACCESSIBLE.

Cloud services are readily available for rent to allow an adversary to both acquire and analyze the data efficiently. The only obstacle that an adversary truly faces in this attack is the implementation details. The attack itself has been well studied and understood, and the resources needed to execute it are now readily available at modest cost as well. To illustrate, development and computation resources spent for this study, excluding data acquisition, totaled less than $3000. Furthermore, pre-collected datasets of certificates and keys are available for purchase or even for free download, which can reduce the time and cost for an attacker to perform this computation.

For the above reasons, we feel it is necessary to reevaluate the implications of this attack in a modern landscape. As in previous works, the goal of this study is to provide an awareness of this attack and improve Internet security overall. Given the nature of the research, it is necessary to use real-world data for our analysis. However, as the results produce compromised keys used to secure real-world communications, we do not provide details on broken keys beyond high-level statistical summaries and we do not retain the broken keys past their use for this project.