LibreSSL: More Than 30 Days Later

tedu@openbsd.org

LibreSSL was officially announced to the world just about exactly five months ago. Bob spoke at BSDCan about the first 30 days. For those who weren't there, I'll quickly rehash some of that material. Also, it's always best to start at the beginning, but then I'll try to focus on some new material and updates.

openssl

Cryptography in general and TLS in particular are pretty difficult to get right. So a monoculture isn't strictly a bad thing. Put all your eggs in one basket, and then watch that basket very carefully, right? Unfortunately, the OpenSSL basket was being watched somewhat less than very carefully. Yeah, it has bugs, but surely somebody else will fix them. And worst case scenario, since everybody uses the same library, everybody will be affected by the bugs. Nobody wants to be alone.

heartbleed

What was unusual about Heartbleed? It's a vulnerability, not an exploit, with a name (and a website!). Previously we'd seen The Internet Worm (when there was only one), then Code Red, Blaster, Stuxnet. They all used various exploits, but the vulnerabilities didn't have names. Heartbleed can't even be considered the worst OpenSSL vuln. Previous bugs have resulted in remote code execution. Anybody remember the Slapper worm? That worm exploited an OpenSSL bug (which was apparently titled the SSLv2 client master key buffer overflow) which revealed not only encrypted data or your private key, but also gave up a remote shell on the server, and then it propogated itself. Yeah, I'd say that's worse. But no headlines.

I mention this just to reinforce that LibreSSL is not the result of "the worst bug ever". I may call you dirty names, but I'm not going to fork your project on the basis of a missing bounds check.

why fork

But why fork? Why not start from scratch? Why not start with some other contender? We did look around a bit, but sadly the state of affairs is that the other contenders aren't so great themselves. Not long before Heartbleed, you may recall Apple dealing with goto fail, aka the worst bug ever, but actually about par for the course.

libressl

posix

#ifdef FIONBIO /* working code */ #else /* crappy workaround */ #endif

In theory, this looks like we're going to use the good code on a posix system. But there's just one problem. The code is testing for FIONBIO, which is only defined if you include the right header. If you forget to include the right header, the compiler falls back to the workaround. The mere existence of workarounds means they can be picked up accidentally, and you'll never know.

socklen_t

Here's a problem. You want to create a variable the same size as socklen_t. One fairly obvious solution would be to declare a variable of type socklen_t. That's not how OpenSSL does things, however. Instead, let's create a union of a couple different ints, call accept(), then inspect the different union members to determine which ones were overwritten by the kernel. Oh, and don't forget to check for big endian versus little endian.

optionmania

#define OPENSSL_NO_HEARTBEATS #define OPENSSL_NO_BUF_FREELISTS

First, the naming convention alone reveals the on by default mentality. Everything is on, you have to pick and choose what to disable. Second, this makes testing for such options problematic. Old versions without the feature don't have the define to disable the feature. Even more bizarrely, this means that future releases of LibreSSL will have to track these defines and continue adding more of them for every new feature we decide not to import. We plan on not adding support for many more features to come.

apply the brakes

I'm going to pick on the FreeBSD guys here for a bit, but it's not their fault.

2014-08-06 OpenSSL advisory

2014-09-09 FreeBSD advisory

What were you guys doing? Oh, that's right. You were probably wading through the 13,000 lines of diff that OpenSSL decided to drop as part of the new release.

Projects need to consider how downstream users will actually deal with their copiuous volume of security patches. This goes back to how did heartbeats get into the ecosystem? Because nobody could keep track of what's going on.

That's not to say LibreSSL development is frozen. We've added support for a few new ciphers, notably chacha20. As we do so, though, we consider what new possible failure modes we may introduce. I wouldn't put a buffer overflow outside the realm of possibility when implementing a cipher, but it's pretty unusual. The inputs to a cipher are quite well defined with little room for error.

SRP

May 5 - Remove libssl SRP

Jul 2 - CVE-2014-5139 (ssl) discovered

Jul 28 - Remove libcrypto SRP

Jul 31 - CVE-2014-5139 (crypto) discovered

Aug 6 - OpenSSL 1.0.1i

Aug 8 - Bug report SRP is broken



On May 5, I removed all the SRP code from libssl, along with kerberos and some other protocol extensions. The problem is that this code is integrated by sprinkling a dozen ifdefs and crazy nested if/else chains into functions that do some pretty critical things, like decide how to exchange keys between client and server. Auditing these 1000 line monoliths was clearly impossible, so we cut down to the basics. The SRP code in libcrypto was left alone because it wasn't directly in the way.

On July 2, OpenSSL received notification about a crash causing bug in the TLS SRP code, but this was not publicly known. On July 28, I deleted the libcrypto SRP code as well. In my commit message, I mentioned "hey there's a bug in here, but the details are secret." This was actually kind of misleading because the secret bug was in a different library, in code that had already been removed.

Three days later, on July 31, two researchers found a remotely expoitable buffer overflow in the libcrypto SRP code. This is like "throw a rock; you're guaranteed to hit something." Even when you point people in the wrong direction, they still find bugs.

On August 6, 1.0.1i was released with fixes for both of the above issues along with about 12800 lines of diff for other stuff. On August 8, a user reported that after upgrading, SRP no longer worked. The fix for the first issue was broken. The bug was embargoed for over a month, and nobody tested the fix.

What's the lesson here? Don't drop jumbo security patches on users. Anybody actually using SRP was in an unfortunate pickle here. They couldn't upgrade to get the fix for the buffer overflow, but instead had to pick it out from the rest. If the patches had been issued separately, as they were discovered, this never would have happened. Nobody's perfect, and I've flubbed a few patches myself, but that's exactly why you don't combine them. If the libssl patch had been released at the beginning of July, the regression would have been discovered and a correct fix hammered out long before the buffer overflow was discovered.

months 2-5

I'm sorry to make it all sound so tame, but avoiding excitement is all part of the plan. The first 30 days were all about revolution. Now we've switched into evolution mode.

portable

I personally do not work on the portable version, but I have kept my eye on it. A few notes on that. First, you'll be happy to know that libressl portable should work on all BSD systems. It works on some other systems, too, but enough about them. Most of the extensioned interfaces in OpenBSD are already shared with other BSDs. You shouldn't even need the portable build system, if you don't like it. The simple BSD makefile build system from OpenBSD will probably suit you better.

That's the good news. The bad news, for now, is that libressl uses the arc4random function to generate random numbers. Theo has a talk coming up all about that, but it's enough for me to say that the arc4random implementations in FreeBSD and NetBSD aren't quite state of the art anymore. I know there are some patches to update FreeBSD at least, but they've stalled out. You're going to want to pick that work up.

We're currently still targeting posix platforms. Windows support isn't out of the question.

ressl

Joel and I have been working on a replacement API for OpenSSL, appropriately entitled ressl. Reimagined SSL is how I think of it. Our goals are consistency and simplicity. In particular, we answer the question "What would the user like to do?" and not "What does the TLS protocol allow the user to do?". You can make a secure connection to a server. You can host a secure server. You can read and write some data over that connection.

A few goals. First, no OpenSSL types or functions are exposed. In fact, not even any ressl internals are exposed. You should never need to contemplate X.509 or ASN.1. Those are implementation details far beyond the level of caring of most developers or users. As a consequence of that, the API is easy for other languges to bind to. The ressl interface could almost equally well describe transport over ssh tunnels. What do you want? Do you want a secure connection? We give you a secure connection.

Perhaps more importantly, it allows the implementation to evolve and change. It's not actually tied to LibreSSL. The libressl library will work with OpenSSL, and can be adapted to use other implementations as well. Previous efforts at replacing OpenSSL usually ended up with compat shims that replicated the OpenSSL API. But that's terrible. If we're going to have a universal API, it needs to be a good one. And it needs to be sufficiently abstract so that others are welcome to the party. Clearly, we've made a claim that LibreSSL is, or at least can be, the best quality TLS stack you can get. But I think the ecosystem benefits by breaking the monoculture.

The ressl API does provide one noteworthy feature. Hostname verification. In order to make a secure TLS connection, you must do two things. Validate the certificate and its trust chain. Then verify that the hostname in the cert matches the hostname you've connected to. Lots of people don't do the latter because OpenSSL doesn't do that latter. You have to do it yourself, which requires knowing about things like CommonNames and SubjectAltNames. The good news is that popular bindings for languages like python and ruby include a function to verify the hostname. The bad news is if you pick a python or ruby project at random, they probably forget to do it. Another funny fact is that since everybody has to write this code themselves, everybody does it a little bit differently. Especially regarding handling of wildcard certificates and everybody's favorite, embedded nul bytes. Hostname verification is on by default in ressl, and the API is designed so that you always provide a hostname; there's no way to accidentally call the function that doesn't do verification.

We have the advantage that we can evolve the client and server APIs in sync with at least two test programs, the OpenBSD ftp client and httpd. We're not nearly ready to call for third party support; that's probably a few months away.

QA

A: The code is in cvs and can be discussed on the tech@ mailing list.