Encrypt-then-MAC

I posted here two weeks ago providing some guidelines on using cryptography ; over the following few days, I was surprised to find that out of the 11 recommendations I made, the most controversial one was my recommendation for securing data by applying AES in CTR mode and then appending an HMAC. Many people suggested that I should instead recommend the use of AES in a block mode which provides both encryption and authentication, such as GCM, CCM, CWC, or EAX, and I explained some of my reasons for eschewing those options on a variety of fora around the internet; but rather than have all of my arguments scattered around the internet, I think it makes sense to provide a more detailed explanation of my reasoning for recommending AES-CTR + HMAC here.

First things first: Why use a composition of encryption and MAC instead of a single primitive which achieves both? Because people are very good at writing bad code. In my experience on the FreeBSD security team, writing dozens of security advisories, there are three types of code which are especially prone to have bugs: New code, complicated code, and rarely-used code. Authenticated encryption schemes fall into all three of these categories, while separate encryption and authentication functions fall into none. It's far easier to make a mistake in implementing AES-CWC than in implementing AES-CTR or HMAC-SHA256; much less likely that testing will uncover such mistakes; and there has been far less time for testing to even have a chance. Of course, most people will not be implementing these cryptographic routines themselves -- most people will use a pre-existing library like OpenSSL -- but this does not invalidate the point: Open source libraries have bugs, too.

Having decided to use a composition of separate encryption and authentication primitives, there remain several options. The three most widely used are

Encrypt-and-MAC: The ciphertext is generated by encrypting the plaintext and then appending a MAC of the plaintext. This is approximately how SSH works. MAC-then-encrypt: The ciphertext is generated by appending a MAC to the plaintext and then encrypting everything. This is approximately how SSL works. Encrypt-then-MAC: The ciphertext is generated by encrypting the plaintext and then appending a MAC of the encrypted plaintext. This is approximately how IPSEC works.

Of these three, only Encrypt-then-MAC is provably secure, in the sense of guaranteeing INT-CTXT (integrity of ciphertexts -- it's unfeasible for an attacker to construct a valid ciphertext other than those which he convinces the key holder to generate) and IND-CCA2 (indistinguishability under adaptive chosen ciphertext attack -- given a ciphertext, it's unfeasible for an attacker to figure out which of two plaintexts it corresponds to, even if he can convince the key holder to decrypt arbitrary messages other than the challenge ciphertext) given secure Encrypt and MAC functions (IND-CPA and strongly unforgeable, respectively). Now, the fact that Encrypt-and-MAC and MAC-then-encrypt aren't provably secure doesn't mean that they are automatically insecure -- it depends on how you use them and the characteristics of the underlying Encrypt and MAC functions -- but there have been a number of vulnerabilities in the past relating to these constructions, whereas Encrypt-then-MAC is essentially impossible to get cryptologically wrong.

Encrypt-then-MAC has another even more important benefit: When decoding data, you can verify the MAC and discard inauthentic packets without ever decrypting anything. This is useful for two reasons: First, it makes a denial of service attack much harder, since it allows you to discard forged packets faster; and second, it reduces your "attack surface". One of the most important rules of computer security is that every line of code is a potential security flaw; if you can make sure that an attacker who doesn't have access to your MAC key can't ever feed evil input to a block of code, however, you dramatically reduce the chance that he will be able to exploit any bugs. (At one point in my Tarsnap online backup system, I include an "unnecessary" HMAC in order to prevent an attacker from feeding evil input to a data decompression library -- the decompressed data gets verified later anyway, but adding an extra HMAC which is verified before decompression helps to protect against any vulnerabilities in the decompression libary.)

This also ties in to why I recommend using CTR mode for encryption, and another reason to avoid using primitives like EAX which combine encryption and authentication: In CTR mode, you avoid passing attacker-provided data to the block cipher (with the possible exception of the nonce which forms part of each block). This reduces the attack surface even further: Using CTR mode, an attacker cannot execute a chosen-ciphertext (or chosen-plaintext) attack against your block cipher, even if (in the case of Encrypt-then-MAC) he can correctly forge the MAC (say, if he stole the MAC key but doesn't have the Encrypt key). Is this an attack scenario worth considering? Absolutely -- the side channel attacks published by Bernstein (exploiting a combination of cache and microarchitectural characteristics) and Osvik, Shamir, and Tromer (exploiting cache collisions) rely on gaining statistical data based on a large number of random tests, and it appears unlikely that such attacks would be feasible in a context where an attacker could not provide a chosen input.

Are there situations where I would not use CTR-then-MAC? Of course; there are always exceptional situations. On tiny hardware systems, it may be preferable to use EAX mode, since that can be implemented with only a block cipher circuit rather than requiring a block cipher circuit AND a hash circuit; similarly, in applications requiring extremely high performance hardware, it may be preferable to use CWC mode, since it allows both encryption and authentication to be parallelized very effectively; and if software performance is absolutely critical, it may be preferable to use OCB, since it is a "one-pass" authenticated encryption primitive and is thus faster than composing separate encryption and authentication primitives. However, for software running on general-purpose computers and sending data over the Internet -- which I imagine covers the vast majority of what my readers are doing with cryptography -- there is no need for these specialized constructions.

I'll conclude with a philosophical note about software design: Assessing the security of software via the question "can we find any security flaws in it?" is like assessing the structure of a bridge by asking the question "has it collapsed yet?" -- it is the most important question, to be certain, but it also profoundly misses the point. Engineers design bridges with built-in safety margins in order to guard against unforeseen circumstances (unexpectedly high winds, corrosion causing joints to weaken, a traffic accident severing support cables, et cetera); secure software should likewise be designed to tolerate failures within individual components. Using a MAC to make sure that an attacker cannot exploit a bug (or a side channel) in encryption code is an example of this approach: If everything works as designed, this adds nothing to the security of the system; but in the real world where components fail, it can mean the difference between being compromised or not. The concept of "security in depth" is not new to network administators; but it's time for software engineers to start applying the same engineering principles within individual applications as well.

Disqus