Cloudflare, a service that helps optimize the security and performance of more than 5.5 million websites, warned customers today that a recently fixed software bug exposed a range of sensitive information that could have included passwords, as well as cookies and tokens used to authenticate users.

A combination of factors made the bug particularly severe. First, the leakage may have been active since September 22, nearly five months before it was discovered, although the greatest period of impact was between February 13 and February 18. Second, some of the highly sensitive data that was leaked was cached by Google and other search engines. The result was that for the entire time the bug was active, hackers had the ability to access the data in real time by making Web requests to affected websites and to access some of the leaked data later by crafting queries on search engines.

"The bug was serious because the leaked memory could contain private information and because it had been cached by search engines," Cloudflare CTO John Graham-Cumming wrote in a blog post published Thursday. "We are disclosing this problem now as we are satisfied that search engine caches have now been cleared of sensitive information. We have also not discovered any evidence of malicious exploits of the bug or other reports of its existence."

The leakage was the result of a bug in an HTML parser chain Cloudflare uses to modify webpages as they pass through the service's edge servers. The parser performs a variety of tasks, such as inserting Google Analytics tags, converting HTTP links to the more secure HTTPS variety, obfuscating e-mail addresses, and hiding parts of a page from malicious Web bots. When the parser was used in combination with three Cloudflare features (e-mail obfuscation, server-side excludes, and Automatic HTTPS Rewrites), it caused Cloudflare edge servers to leak pseudorandom memory contents into certain HTTP responses.
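To make this class of bug concrete, here is a minimal Python simulation of a buffer over-read in a page rewriter. It is an illustration only, not Cloudflare's code. It assumes a scanner that consumes a tag opener as a two-byte unit and detects the end of the page with an equality test, so a page ending in a stray "<" pushes the read position past the end marker. The function name, the memory layout, and the sample cookie string are all hypothetical.

```python
def rewrite(memory: bytes, end: int, strict_end_check: bool) -> bytes:
    """Copy the page stored in memory[0:end] to an output buffer.

    `memory` stands in for process memory: the page is followed by
    unrelated data. `end` is the offset one past the page's last byte.
    """
    out = bytearray()
    p = 0
    while p < len(memory):  # stand-in for "until something else stops us"
        # Hypothetical flaw: an equality test misses a pointer that has
        # already jumped past `end`; a >= test would still catch it.
        reached_end = (p >= end) if strict_end_check else (p == end)
        if reached_end:
            break
        if memory[p] == ord("<"):
            # A tag opener is consumed as a two-byte unit. The copy is
            # clamped to the page, but the pointer advance is not.
            out += memory[p:min(p + 2, end)]
            p += 2
        else:
            out.append(memory[p])
            p += 1
    return bytes(out)


# A page that ends with an unbalanced tag opener, followed in "memory"
# by another user's data (made-up sample values).
page = b"<p>hi</p><"
neighbor = b"Cookie: session=SECRET"
memory = page + neighbor

leaky = rewrite(memory, len(page), strict_end_check=False)
fixed = rewrite(memory, len(page), strict_end_check=True)
print(leaky)
print(fixed)
```

With the equality test, the response contains the page followed by bytes that belong to the neighboring data. With the stricter test, the response contains only the page. A well-formed page never triggers the over-read in either mode, which fits the description of the leak as appearing only in certain HTTP responses.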

Quick response

Within an hour of the bug coming to Cloudflare's attention early last Saturday morning, engineers had already disabled e-mail obfuscation, a measure that mostly plugged the memory leak. It took another six hours for Cloudflare to identify and fix the underlying bug in the HTML parser.

The leak is vaguely reminiscent of the Heartbleed vulnerability that exposed passwords, secret encryption keys, and other sensitive memory contents residing in servers running a vulnerable version of the OpenSSL crypto library. Unlike Heartbleed, however, the parser bug could be exploited only opportunistically against certain sites that used Cloudflare. It also didn't expose transport layer security keys. The leak was spotted by Google security researcher Tavis Ormandy while he was working on a "corpus distillation project." He and colleagues then struggled to understand what the data was and what was exposing it.

"It became clear after a while we were looking at chunks of uninitialized memory interspersed with valid data," he wrote in a blog post that was also published Thursday. "The program that this uninitialized data was coming from just happened to have the data I wanted in memory at the time. That solved the mystery, but some of the nearby memory had strings and objects that really seemed like they could be from a reverse proxy operated by Cloudflare - a major [content delivery network] service."

Ormandy continued:

A while later, we figured out how to reproduce the problem. It looked like that if an html page hosted behind cloudflare had a specific combination of unbalanced tags, the proxy would intersperse pages of uninitialized memory into the output (kinda like heartbleed, but cloudflare specific and worse for reasons I'll explain later). My working theory was that this was related to their "ScrapeShield" feature which parses and obfuscates html - but because reverse proxies are shared between customers, it would affect *all* Cloudflare customers. We fetched a few live samples, and we observed encryption keys, cookies, passwords, chunks of POST data and even HTTPS requests for other major cloudflare-hosted sites from other users. Once we understood what we were seeing and the implications, we immediately stopped and contacted cloudflare security. This situation was unusual, PII was actively being downloaded by crawlers and users during normal usage, they just didn't understand what they were seeing. Seconds mattered here, emails to support on a friday evening were not going to cut it. I don't have any cloudflare contacts, so reached out for an urgent contact on twitter, and quickly reached the right people. https://twitter.com/taviso/status/832744397800214528 After I explained the situation, cloudflare quickly reproduced the problem, told me they had convened an incident and had an initial mitigation in place within an hour. "You definitely got the right people. We have killed the affected services"

In an update published later, Ormandy took issue with the post Cloudflare published. "It contains an excellent postmortem, but severely downplays the risk to customers," he wrote. In a Twitter message, Ormandy said Cloudflare customers affected by the bug included Uber, 1Password, Fitbit, and OkCupid. 1Password said in a blog post that no sensitive data was exposed because it was encrypted in transit.

Graham-Cumming, the Cloudflare CTO, has ruled out the possibility that secret keys for customers' transport layer security certificates were exposed in the leaks. Still, he said end-user passwords, authentication cookies, OAuth tokens used to log into multiple website accounts, and encryption keys Cloudflare uses to protect server-to-server traffic were all at risk of being exposed. Cloudflare customers should at a minimum strongly consider changing passwords. Security researcher Ryan Lackey has published additional security advice.

Cloudflare researchers have identified 770 unique URIs that contained leaked memory and were cached by Google, Bing, Yahoo, or other search engines. The 770 unique URIs covered 161 unique domains. Graham-Cumming said Thursday's disclosure came only after the leaked data was fully purged with the help of the search engines. Google's cache, however, appeared to show that some data exposed by the bug remained accessible, as evidenced by cached links and social media threads.