Abstract: Heartbleed wasn’t fun. It represents us moving from “attacks could happen” to “attacks have happened”, and that’s not necessarily a good thing. The larger takeaway actually isn’t “This wouldn’t have happened if we didn’t add Ping”, the takeaway is “We can’t even add Ping, how the heck are we going to fix everything else?”. The answer is that we need to take Matthew Green’s advice, start getting serious about figuring out what software has become Critical Infrastructure to the global economy, and dedicating genuine resources to supporting that code. It took three years to find Heartbleed. We have to move towards a model of No More Accidental Finds.

======================

You know, I’d hoped I’d be able to avoid a long form writeup on Heartbleed. Such things are not meant to be. I’m going to leave many of the gory technical details to others, but there’s a few angles that haven’t been addressed and really need to be. So, let’s talk. What to make of all this noise?

First off, there’s been a subtle shift in the risk calculus around security vulnerabilities. Before, we used to say: “A flaw has been discovered. Fix it, before it’s too late.” In the case of Heartbleed, the presumption is that it’s already too late, that all information that could be extracted, has been extracted, and that pretty much everyone needs to execute emergency remediation procedures.

It’s a significant change, to assume the worst has already occurred.

It always seems like a good idea in security to emphasize prudence over accuracy, possible risk over evidence of actual attack. And frankly this policy has been run by the privacy community for some time now. Is this a positive shift? It certainly allows an answer to the question for your average consumer, “What am I supposed to do in response to this Internet ending bug?” “Well, presume all your passwords leaked and change them!”

I worry, and not merely because “You can’t be too careful” has not at all been an entirely pleasant policy in the real world. We have lots of bugs in software. Shall we presume every browser flaw not only needs to be patched, but has already been exploited globally worldwide, and you should wipe your machine any time one is discovered? This OpenSSL flaw is pernicious, sure. We’ve had big flaws before, ones that didn’t just provide read access to remote memory either. Why the freak out here?

Because we expected better, here, of all places.

There’s been quite a bit of talk, about how we never should have been exposed to Heartbleed at all, because TLS heartbeats aren’t all that important a feature anyway. Yes, it’s 2014, and the security community is complaining about Ping again. This is of course pretty rich, given that it seems half of us just spent the last few days pinging the entire Internet to see who’s still exposed to this particular flaw. We in security sort of have blinders on, in that if the feature isn’t immediately and obviously useful to us, we don’t see the point.

In general, you don’t want to see a protocol designed by the security community. It won’t do much. In return (with the notable and very appreciated exception of Dan Bernstein), the security community doesn’t want to design you a protocol. It’s pretty miserable work. Thanks to what I’ll charitably describe as “unbound creativity” the fast and dumb and unmodifying design of the Internet has made way to a hodge podge of proxies and routers and “smart” middleboxes that do who knows what. Protocol design is miserable, nothing is elegant. Anyone who’s spent a day or two trying to make P2P VoIP work on the modern Internet discovers very quickly why Skype was worth billions. It worked, no matter what.

Anyway, in an alternate universe TLS heartbeats (with full ping functionality) are a beloved security feature of the protocol as they’re the key to constant time, constant bandwidth tunneling of data over TLS without horrifying application layer hacks. As is, they’re tremendously useful for keeping sessions alive, a thing I’d expect hackers with even a mild amount of experience with remote shells to appreciate. The Internet is moving to long lived sessions, as all Gmail users can attest to. KeepAlives keep long lived things working. SSH has been supporting protocol layer KeepAlives forever, as can be seen:

The takeaway here is not “If only we hadn’t added ping, this wouldn’t have happened.” The true lesson is, “If only we hadn’t added anything at all, this wouldn’t have happened.” In other words, if we can’t even securely implement Ping, how could we ever demand “important” changes? Those changes tend to be much more fiddly, much more complicated, much riskier. But if we can’t even securely add this line of code:

if (1 + 2 + payload + 16 > s->s3->rrec.length)

I know Neel Mehta. I really like Neel Mehta. It shouldn’t take absolute heroism, one of the smartest guys in our community, and three years for somebody to notice a flaw when there’s a straight up length field in the patch. And that, I think, is a major and unspoken component of the panic around Heartbleed. The OpenSSL dev shouldn’t have written this (on New Years Eve, at 1AM apparently). His coauthors and release engineers shouldn’t have let it through. The distros should have noticed. Somebody should have been watching the till, at least this one particular till, and it seems nobody was.

Nobody publicly, anyway.

If we’re going to fix the Internet, if we’re going to really change things, we’re going to need the freedom to do a lot more dramatic changes than just Ping over TLS. We have to be able to manage more; we’re failing at less.

There’s a lot of rigamarole around defense in depth, other languages that OpenSSL could be written in, “provable software”, etc. Everyone, myself included, has some toy that would have fixed this. But you know, word from the Wall Street Journal is that there have been all of $841 in donations to the OpenSSL project to address this matter. We are building the most important technologies for the global economy on shockingly underfunded infrastructure. We are truly living through Code in the Age of Cholera.

Professor Matthew Green of Johns Hopkins University recently commented that he’s been running around telling the world for some time that OpenSSL is Critical Infrastructure. He’s right. He really is. The conclusion is resisted strongly, because you cannot imagine the regulatory hassles normally involved with traditionally being deemed Critical Infrastructure. A world where SSL stacks have to be audited and operated against such standards is a world that doesn’t run SSL stacks at all.

And so, finally, we end up with what to learn from Heartbleed. First, we need a new model of Critical Infrastructure protection, one that dedicates real financial resources to the safety and stability of the code our global economy depends on – without attempting to regulate that code to death. And second, we need to actually identify that code.

When I said that we expected better of OpenSSL, it’s not merely that there’s some sense that security-driven code should be of higher quality. (OpenSSL is legendary for being considered a mess, internally.) It’s that the number of systems that depend on it, and then expose that dependency to the outside world, are considerable. This is security’s largest contributed dependency, but it’s not necessarily the software ecosystem’s largest dependency. Many, maybe even more systems depend on web servers like Apache, nginx, and IIS. We fear vulnerabilities significantly more in libz than libbz2 than libxz, because more servers will decompress untrusted gzip over bzip2 over xz. Vulnerabilities are not always in obvious places – people underestimate just how exposed things like libxml and libcurl and libjpeg are. And as HD Moore showed me some time ago, the embedded space is its own universe of pain, with 90’s bugs covering entire countries.

If we accept that a software dependency becomes Critical Infrastructure at some level of economic dependency, the game becomes identifying those dependencies, and delivering direct technical and even financial support. What are the one million most important lines of code that are reachable by attackers, and least covered by defenders? (The browsers, for example, are very reachable by attackers but actually defended pretty zealously – FFMPEG public is not FFMPEG in Chrome.)

Note that not all code, even in the same project, is equally exposed. It’s tempting to say it’s a needle in a haystack. But I promise you this: Anybody patches Linux/net/ipv4/tcp_input.c (which handles inbound network for Linux), a hundred alerts are fired and many of them are not to individuals anyone would call friendly. One guy, one night, patched OpenSSL. Not enough defenders noticed, and it took Neel Mehta to do something.

We fix that, or this happens again. And again. And again.

No more accidental finds. The stakes are just too high.