by Tobias Goeschel

Everyone’s shaking their heads

First of all, I assume that by now, everyone who has ever read a single tweet in his/her life has heard about Apple’s instantly infamous “gotofail” bug by now, and most of you have probably already read Imperial Violet’s analysis of it.

To sum up the debacle in short: A duplicate line of code, goto fail; , causes a critical SSL certificate verification algorithm to jump out of a series of validations at an unexpected time, causing a success value to be returned, and thus rendering the service vulnerable to attacks.

Bad. To say the least.

Now it seems that people unanimously agree in blaming missing curly braces around the if statement in this piece of code

if ( ( err = SSLHashSHA1. update ( & hashCtx , & signedParams ) ) != 0 ) goto fail ; goto fail ; if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) goto fail; goto fail;

for the entire mess, and the common conclusion from this fiasco is “always put curly braces around your if statements, and this will never happen to you”.

Or will it? I mean, I find it rather curious that everyone seems to blame the mouse, while there’s a giant elephant in the room…



Now let’s look at that code again

Here’s the entire method, taken from Apple’s open source publication:

static OSStatus SSLVerifySignedServerKeyExchange ( SSLContext * ctx , bool isRsa , SSLBuffer signedParams , uint8_t * signature , UInt16 signatureLen ) { OSStatus err ; SSLBuffer hashOut , hashCtx , clientRandom , serverRandom ; uint8_t hashes [ SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN ] ; SSLBuffer signedHashes ; uint8_t * dataToSign ; size_t dataToSignLen ; signedHashes. data = 0 ; hashCtx. data = 0 ; clientRandom. data = ctx -> clientRandom ; clientRandom. length = SSL_CLIENT_SRVR_RAND_SIZE ; serverRandom. data = ctx -> serverRandom ; serverRandom. length = SSL_CLIENT_SRVR_RAND_SIZE ; if ( isRsa ) { /* skip this if signing with DSA */ dataToSign = hashes ; dataToSignLen = SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN ; hashOut. data = hashes ; hashOut. length = SSL_MD5_DIGEST_LEN ; if ( ( err = ReadyHash ( & SSLHashMD5 , & hashCtx ) ) != 0 ) goto fail ; if ( ( err = SSLHashMD5. update ( & hashCtx , & clientRandom ) ) != 0 ) goto fail ; if ( ( err = SSLHashMD5. update ( & hashCtx , & serverRandom ) ) != 0 ) goto fail ; if ( ( err = SSLHashMD5. update ( & hashCtx , & signedParams ) ) != 0 ) goto fail ; if ( ( err = SSLHashMD5. final ( & hashCtx , & hashOut ) ) != 0 ) goto fail ; } else { /* DSA, ECDSA - just use the SHA1 hash */ dataToSign = & hashes [ SSL_MD5_DIGEST_LEN ] ; dataToSignLen = SSL_SHA1_DIGEST_LEN ; } hashOut. data = hashes + SSL_MD5_DIGEST_LEN ; hashOut. length = SSL_SHA1_DIGEST_LEN ; if ( ( err = SSLFreeBuffer ( & hashCtx ) ) != 0 ) goto fail ; if ( ( err = ReadyHash ( & SSLHashSHA1 , & hashCtx ) ) != 0 ) goto fail ; if ( ( err = SSLHashSHA1. update ( & hashCtx , & clientRandom ) ) != 0 ) goto fail ; if ( ( err = SSLHashSHA1. update ( & hashCtx , & serverRandom ) ) != 0 ) goto fail ; if ( ( err = SSLHashSHA1. update ( & hashCtx , & signedParams ) ) != 0 ) goto fail ; goto fail ; if ( ( err = SSLHashSHA1. final ( & hashCtx , & hashOut ) ) != 0 ) goto fail ; err = sslRawVerify ( ctx , ctx -> peerPubKey , dataToSign , /* plaintext */ dataToSignLen , /* plaintext length */ signature , signatureLen ) ; if ( err ) { sslErrorLog ( "SSLDecodeSignedServerKeyExchange: sslRawVerify " "returned %d

" , ( int ) err ) ; goto fail ; } fail : SSLFreeBuffer ( & signedHashes ) ; SSLFreeBuffer ( & hashCtx ) ; return err ; } static OSStatus SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams, uint8_t *signature, UInt16 signatureLen) { OSStatus err; SSLBuffer hashOut, hashCtx, clientRandom, serverRandom; uint8_t hashes[SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN]; SSLBuffer signedHashes; uint8_t *dataToSign; size_t dataToSignLen; signedHashes.data = 0; hashCtx.data = 0; clientRandom.data = ctx->clientRandom; clientRandom.length = SSL_CLIENT_SRVR_RAND_SIZE; serverRandom.data = ctx->serverRandom; serverRandom.length = SSL_CLIENT_SRVR_RAND_SIZE; if(isRsa) { /* skip this if signing with DSA */ dataToSign = hashes; dataToSignLen = SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN; hashOut.data = hashes; hashOut.length = SSL_MD5_DIGEST_LEN; if ((err = ReadyHash(&SSLHashMD5, &hashCtx)) != 0) goto fail; if ((err = SSLHashMD5.update(&hashCtx, &clientRandom)) != 0) goto fail; if ((err = SSLHashMD5.update(&hashCtx, &serverRandom)) != 0) goto fail; if ((err = SSLHashMD5.update(&hashCtx, &signedParams)) != 0) goto fail; if ((err = SSLHashMD5.final(&hashCtx, &hashOut)) != 0) goto fail; } else { /* DSA, ECDSA - just use the SHA1 hash */ dataToSign = &hashes[SSL_MD5_DIGEST_LEN]; dataToSignLen = SSL_SHA1_DIGEST_LEN; } hashOut.data = hashes + SSL_MD5_DIGEST_LEN; hashOut.length = SSL_SHA1_DIGEST_LEN; if ((err = SSLFreeBuffer(&hashCtx)) != 0) goto fail; if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0) goto fail; if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0) goto fail; if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0) goto fail; if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) goto fail; goto fail; if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) goto fail; err = sslRawVerify(ctx, ctx->peerPubKey, dataToSign, /* plaintext */ dataToSignLen, /* plaintext length */ signature, signatureLen); if(err) { sslErrorLog("SSLDecodeSignedServerKeyExchange: sslRawVerify " "returned %d

", (int)err); goto fail; } fail: SSLFreeBuffer(&signedHashes); SSLFreeBuffer(&hashCtx); return err; }

I find it amusing that the first thing anyone would think when looking at this code is “there should have been curly braces”. To make a point, here’s how that would look:

static OSStatus SSLVerifySignedServerKeyExchange ( SSLContext * ctx , bool isRsa , SSLBuffer signedParams , uint8_t * signature , UInt16 signatureLen ) { OSStatus err ; SSLBuffer hashOut , hashCtx , clientRandom , serverRandom ; uint8_t hashes [ SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN ] ; SSLBuffer signedHashes ; uint8_t * dataToSign ; size_t dataToSignLen ; signedHashes. data = 0 ; hashCtx. data = 0 ; clientRandom. data = ctx -> clientRandom ; clientRandom. length = SSL_CLIENT_SRVR_RAND_SIZE ; serverRandom. data = ctx -> serverRandom ; serverRandom. length = SSL_CLIENT_SRVR_RAND_SIZE ; if ( isRsa ) { /* skip this if signing with DSA */ dataToSign = hashes ; dataToSignLen = SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN ; hashOut. data = hashes ; hashOut. length = SSL_MD5_DIGEST_LEN ; if ( ( err = ReadyHash ( & SSLHashMD5 , & hashCtx ) ) != 0 ) { goto fail ; } if ( ( err = SSLHashMD5. update ( & hashCtx , & clientRandom ) ) != 0 ) { goto fail ; } if ( ( err = SSLHashMD5. update ( & hashCtx , & serverRandom ) ) != 0 ) { goto fail ; } if ( ( err = SSLHashMD5. update ( & hashCtx , & signedParams ) ) != 0 ) { goto fail ; } if ( ( err = SSLHashMD5. final ( & hashCtx , & hashOut ) ) != 0 ) { goto fail ; } } else { /* DSA, ECDSA - just use the SHA1 hash */ dataToSign = & hashes [ SSL_MD5_DIGEST_LEN ] ; dataToSignLen = SSL_SHA1_DIGEST_LEN ; } hashOut. data = hashes + SSL_MD5_DIGEST_LEN ; hashOut. length = SSL_SHA1_DIGEST_LEN ; if ( ( err = SSLFreeBuffer ( & hashCtx ) ) != 0 ) { goto fail ; } if ( ( err = ReadyHash ( & SSLHashSHA1 , & hashCtx ) ) != 0 ) { goto fail ; } if ( ( err = SSLHashSHA1. update ( & hashCtx , & clientRandom ) ) != 0 ) { goto fail ; } if ( ( err = SSLHashSHA1. update ( & hashCtx , & serverRandom ) ) != 0 ) { goto fail ; } if ( ( err = SSLHashSHA1. update ( & hashCtx , & signedParams ) ) != 0 ) { goto fail ; } goto fail ; if ( ( err = SSLHashSHA1. final ( & hashCtx , & hashOut ) ) != 0 ) { goto fail ; } err = sslRawVerify ( ctx , ctx -> peerPubKey , dataToSign , /* plaintext */ dataToSignLen , /* plaintext length */ signature , signatureLen ) ; if ( err ) { sslErrorLog ( "SSLDecodeSignedServerKeyExchange: sslRawVerify " "returned %d

" , ( int ) err ) ; goto fail ; } fail : SSLFreeBuffer ( & signedHashes ) ; SSLFreeBuffer ( & hashCtx ) ; return err ; } static OSStatus SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams, uint8_t *signature, UInt16 signatureLen) { OSStatus err; SSLBuffer hashOut, hashCtx, clientRandom, serverRandom; uint8_t hashes[SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN]; SSLBuffer signedHashes; uint8_t *dataToSign; size_t dataToSignLen; signedHashes.data = 0; hashCtx.data = 0; clientRandom.data = ctx->clientRandom; clientRandom.length = SSL_CLIENT_SRVR_RAND_SIZE; serverRandom.data = ctx->serverRandom; serverRandom.length = SSL_CLIENT_SRVR_RAND_SIZE; if(isRsa) { /* skip this if signing with DSA */ dataToSign = hashes; dataToSignLen = SSL_SHA1_DIGEST_LEN + SSL_MD5_DIGEST_LEN; hashOut.data = hashes; hashOut.length = SSL_MD5_DIGEST_LEN; if ((err = ReadyHash(&SSLHashMD5, &hashCtx)) != 0) { goto fail; } if ((err = SSLHashMD5.update(&hashCtx, &clientRandom)) != 0) { goto fail; } if ((err = SSLHashMD5.update(&hashCtx, &serverRandom)) != 0) { goto fail; } if ((err = SSLHashMD5.update(&hashCtx, &signedParams)) != 0) { goto fail; } if ((err = SSLHashMD5.final(&hashCtx, &hashOut)) != 0) { goto fail; } } else { /* DSA, ECDSA - just use the SHA1 hash */ dataToSign = &hashes[SSL_MD5_DIGEST_LEN]; dataToSignLen = SSL_SHA1_DIGEST_LEN; } hashOut.data = hashes + SSL_MD5_DIGEST_LEN; hashOut.length = SSL_SHA1_DIGEST_LEN; if ((err = SSLFreeBuffer(&hashCtx)) != 0) { goto fail; } if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0) { goto fail; } if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0) { goto fail; } if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0) { goto fail; } if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) { goto fail; } goto fail; if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) { goto fail; } err = sslRawVerify(ctx, ctx->peerPubKey, dataToSign, /* plaintext */ dataToSignLen, /* plaintext length */ signature, signatureLen); if(err) { sslErrorLog("SSLDecodeSignedServerKeyExchange: sslRawVerify " "returned %d

", (int)err); goto fail; } fail: SSLFreeBuffer(&signedHashes); SSLFreeBuffer(&hashCtx); return err; }

Can you spot the error? I mean: That, obviously, would have made one hell of a difference, wouldn’t it?

What seems to be the problem?

There’s no arguing around the fact that the duplicate line of code is a horrible, horrible bug. But as to how it got there, I beg to differ from the conclusion that everyone else seems to agree on:

Those braces aren’t at fault. A lazy programmer is.

Alright, maybe not entirely. A minor part in this mess can be attributed to the IDE (Xcode, I would assume) not catching the fact that a sizeable portion of the code is unreachable. A modern IDE should really show a warning in such cases, and as Peter Nelson points out, even Xcode seems to have an option for that, though it isn’t on by default – strangely enough, I might add.

But how do we fix it?

Now what can we learn from this? Here’s a number of things we can do to avoid this kind of disaster:

Just try the damn thing and see if it works Duh. I mean: really. Why wouldn’t you? And since the purpose of this code is obviously not to allow a key exchange, but rather to deny it, if anything is not according to protocol, you should be throwing all the stuff that could be forged at it, not just valid values. Write an automated test This is the next obvious step, and like the manual test, it should verify all the possible ways the certificate validation could fail, first of all. Landon Fuller wrote up an example to show it is quite possible to run an integration test against this method. Have someone else review your code Another obvious one. This is a hugely critical piece of code at a very, very exposed position in a fundamental part of an operating system – no way this should have ever seen the light of day without at least a second pair of eyes having a look at it. No. Way. Pair program One step up from code reviews: Two brains are smarter than one. Four eyes see more than two. Your code will instantly get better in every way if you agree to share ownership of it. Even if you overlook something like this when hacking away at your code, your pairing partner most likely won’t. Also, they might have better ideas on how to do things, such as: Conditionals should express what you actually want to check That, to me, is one of the most valuable pieces of advice you can take from Uncle Bob Martin: If-statements should encapsulate code that is executed only when a condition is true –

not jump out of an algorithm or method, if otherwise. In this case, instead of employing if(err != 0) and what looks like ten million goto fail; commands, the broken part of the method should have checked for (err == 0) , and thus looked at least like this: if ( ( err = SSLFreeBuffer ( & hashCtx ) ) == 0 ) if ( ( err = ReadyHash ( & SSLHashSHA1 , & hashCtx ) ) == 0 ) if ( ( err = SSLHashSHA1. update ( & hashCtx , & clientRandom ) ) == 0 ) if ( ( err = SSLHashSHA1. update ( & hashCtx , & serverRandom ) ) == 0 ) if ( ( err = SSLHashSHA1. update ( & hashCtx , & signedParams ) ) == 0 ) err = SSLHashSHA1. final ( & hashCtx , & hashOut ) ; if ( err ) goto fail ; if ((err = SSLFreeBuffer(&hashCtx)) == 0) if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) == 0) if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) == 0) if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) == 0) if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) == 0) err = SSLHashSHA1.final(&hashCtx, &hashOut); if (err) goto fail; which, then, can be simplified even further to if ( ( err = SSLFreeBuffer ( & hashCtx ) ) == 0 && ( err = ReadyHash ( & SSLHashSHA1 , & hashCtx ) ) == 0 && ( err = SSLHashSHA1. update ( & hashCtx , & clientRandom ) ) == 0 && ( err = SSLHashSHA1. update ( & hashCtx , & serverRandom ) ) == 0 && ( err = SSLHashSHA1. update ( & hashCtx , & signedParams ) ) == 0 ) err = SSLHashSHA1. final ( & hashCtx , & hashOut ) ; if ( err ) goto fail ; if ((err = SSLFreeBuffer(&hashCtx)) == 0 && (err = ReadyHash(&SSLHashSHA1, &hashCtx)) == 0 && (err = SSLHashSHA1.update(&hashCtx, &clientRandom)) == 0 && (err = SSLHashSHA1.update(&hashCtx, &serverRandom)) == 0 && (err = SSLHashSHA1.update(&hashCtx, &signedParams)) == 0 ) err = SSLHashSHA1.final(&hashCtx, &hashOut); if (err) goto fail; Notice how this kind of structure shows what we really want to do: Execute a sequence of steps, and proceed to the next step only if the current step did not return an error. If it did, then any subsequent statements won’t execute, and if err is not 0 after the whole block, there’s one goto fail; , which also states the programmer’s original intent more precisely: If anything went wrong, exit the method. Don’t copy and paste code The most blatant thing I noticed when I glanced over the rest of the source file that contains the bug is the amount of duplicate or nearly duplicate code that can be found. Clearly, someone tried to go the easy way and copied/pasted the same code all over the place. I found minor variations of the defective if-goto-sequence if ( ( err = SSLFreeBuffer ( & hashCtx ) ) != 0 ) goto fail ; if ( ( err = ReadyHash ( & SSLHashSHA1 , & hashCtx ) ) != 0 ) goto fail ; if ( ( err = SSLHashSHA1. update ( & hashCtx , & clientRandom ) ) != 0 ) goto fail ; if ( ( err = SSLHashSHA1. update ( & hashCtx , & serverRandom ) ) != 0 ) goto fail ; if ( ( err = SSLHashSHA1. update ( & hashCtx , & signedParams ) ) != 0 ) goto fail ; if ( ( err = SSLHashSHA1. final ( & hashCtx , & hashOut ) ) != 0 ) goto fail ; if ((err = SSLFreeBuffer(&hashCtx)) != 0) goto fail; if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0) goto fail; if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0) goto fail; if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0) goto fail; if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) goto fail; if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) goto fail; in at least 5 places – all virtually identical, and all equally horrid; a definite sign of gratuitous and unreflected use of copy and paste. In fact, you can eliminate the bug, its cause and about a third of the code in the SSLVerifySignedServerKeyExchange() method by extracting this rather uniform sequence of calls to a HashReference into its own method: static OSStatus SSLApplyHash ( const HashReference * hash , SSLBuffer * hashCtx , SSLBuffer * clientRandom , SSLBuffer * serverRandom , SSLBuffer * signedParams , SSLBuffer * hashOut ) { OSStatus err ; if ( ( err = SSLFreeBuffer ( hashCtx ) ) == 0 && ( err = ReadyHash ( hash , hashCtx ) ) == 0 && ( err = hash -> update ( hashCtx , clientRandom ) ) == 0 && ( err = hash -> update ( hashCtx , serverRandom ) ) == 0 && ( err = hash -> update ( hashCtx , signedParams ) ) == 0 ) err = hash -> final ( hashCtx , hashOut ) ; return err ; } static OSStatus SSLApplyHash( const HashReference *hash, SSLBuffer *hashCtx, SSLBuffer *clientRandom, SSLBuffer *serverRandom, SSLBuffer *signedParams, SSLBuffer *hashOut) { OSStatus err; if ((err = SSLFreeBuffer(hashCtx)) == 0 && (err = ReadyHash(hash, hashCtx)) == 0 && (err = hash->update(hashCtx, clientRandom)) == 0 && (err = hash->update(hashCtx, serverRandom)) == 0 && (err = hash->update(hashCtx, signedParams)) == 0 ) err = hash->final(hashCtx, hashOut); return err; } which can then be called from anywhere using a single line, such as: err = SSLApplyHash ( & SSLHashMD5 , & hashCtx , & clientRandom , & serverRandom , & signedParams , & hashOut ) ; err = SSLApplyHash(&SSLHashMD5, &hashCtx, &clientRandom, &serverRandom, &signedParams, &hashOut); I bet that would eliminate at least 50 lines of crappy code from the source file. [Update]

As pg (see comments section) points out, cleaning up this part of the code should have gone even further. Please refer to his very good post to find out just how far. Make your methods do only one thing If you put it in natural language, the SSLVerifySignedServerKeyExchange() method “does a lot of magical things with hashes (no need to get into details), which need to be done in a specific order, but only if all steps in the sequence run without error, and with slight differences depending on which kind of keys are used, and log errors otherwise.”

Clearly, that is quite a lot more than one thing. And that is a very distinct indicator that it should be significantly shorter, and more precise.

To me, it should really be split up into several individual methods: The above-mentioned SSLApplyHash utility method

utility method A utility method to initialize SSLBuffer variables from the passed in context data, to be used somewhat like this:

SSLBuffer clientRandom = SSLRandomBuffer( ctx->clientRandom );

Two more methods to encapsulate the RSA and non-RSA execution paths

Finally, the original API method, which should basically only decide whether to execute the RSA or non-RSA paths, and write the log message in case of an error. Apart from getting rid of a lot of code duplication (not just within this method, but in many other places within the file, as well), this would make the algorithm a lot more readable and thus overlooking errors far less likely. Use readable, descriptive variable names Try to explain the meaning of these without reading the context of the surrounding code:

hashes,

hashOut,

hashCtx,

ctx,

clientRandom,

serverRandom,

signedParams,

dataToSign . Wouldn’t it have been a lot more understandable to call them, say,

appliedHashResults,

hashOutputBuffer,

tmpHashContext,

sslContext,

randomClientValueForSharedSecret,

randomServerValueForSharedSecret,

messageToSign ? And these were only the first quick ideas I came up with while writing this paragraph, not even a well-reflected choice of names that originated from hours of work on the code that contains these variables…

Conclusion / tl;dr

To make it clear once again: This should never have happened.

It shouldn’t have happened, because this bug is so obvious, that any reasonable amount of scrutiny, which should always be applied to critical pieces of code, will catch it.

It shouldn’t have happened, because even if the humans missed it, the tooling should have complained.

It shouldn’t have happened, because a simple test would have shown that the code never actually did what it was intended to do.

But first and foremost, it shouldn’t have happened, because making the algorithm terse and readable would have forced the programmer to think about flow, structure and intent of his/her code more thoroughly, and at that point, the superfluous goto fail; would have stuck out like a sore thumb.

There’s only one thing to blame here, and that has nothing to do with code style (as which I would consider where the brackets and braces go, whether to add empty lines, etc. etc.), but rather with craftsmanship and professional attitude.

And I have to say it: It’s not worthy of a company like Apple, which prides itself to sell only the highest quality products, to produce such sloppy, unchecked, untested and obviously uncared for code, least of all in a critical part of the foundations of the operating system that runs on all of its devices.