Update: I’ve updated the section on Cryptographic Doom at the end of the article after clarifications from the age author. That specific criticism was based on my misreading of the age spec.

Age is a new tool for encrypting files, intended to be a more modern successor to PGP/GPG for file encryption. This is a welcome development, as PGP has definitely been showing its age recently. On the face of it, age looks like a good replacement using modern algorithms. But I have a few concerns about its design.

Authenticated encryption

One of the innovations of age is that it aims to support streaming authenticated encryption. The spec (such as it is) links to Adam Langley’s blog post about streaming encryption, which mentions use-cases like the following:

gpg -d your_archive.tgz.gpg | tar xz

The comments on Hacker News also mentioned cases where people pipe a decrypted file into a shell. Given that age links to this blog post and has implemented a secure streaming AEAD mode, it seems reasonable to suppose that it is intended to be secure in these kinds of use-cases. Each chunk (segment) of ciphertext is authenticated before being decrypted and output, so tar or the shell never sees unauthenticated plaintext, and an attacker can’t tamper with the ciphertext to influence the data being fed into the downstream process. The worst the attacker can do is truncate the output by corrupting one or more segments, causing decryption to abort partway through (which might still allow significant mischief).
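To make the chunk-by-chunk idea concrete, here is a minimal sketch of my own, not age’s actual construction. It only authenticates chunks (HMAC-SHA256 stands in for the AEAD, so nothing is encrypted), and the chunk size, the function names and the counter-plus-final-flag layout are all assumptions chosen for illustration. What it shows is that each chunk is released only after its tag verifies, and that a corrupted segment still leaves the earlier chunks already delivered downstream.

```python
import hashlib
import hmac

CHUNK = 16    # tiny chunk size, for illustration only
TAG_LEN = 32  # HMAC-SHA256 tag length


def _tag(key: bytes, index: int, last: bool, chunk: bytes) -> bytes:
    # Bind each tag to the chunk's position and to whether it is the final
    # chunk, so chunks cannot be reordered and a missing tail is noticed.
    header = index.to_bytes(8, "big") + (b"\x01" if last else b"\x00")
    return hmac.new(key, header + chunk, hashlib.sha256).digest()


def protect(key: bytes, data: bytes) -> list:
    """Split data into chunks and append a tag to each (no encryption)."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)] or [b""]
    return [
        chunk + _tag(key, i, i == len(chunks) - 1, chunk)
        for i, chunk in enumerate(chunks)
    ]


def verified_stream(key: bytes, segments):
    """Yield each chunk only after its own tag has been verified."""
    saw_last = False
    for i, segment in enumerate(segments):
        if saw_last:
            raise ValueError("data after final segment")
        chunk, tag = segment[:-TAG_LEN], segment[-TAG_LEN:]
        if hmac.compare_digest(tag, _tag(key, i, True, chunk)):
            saw_last = True
        elif not hmac.compare_digest(tag, _tag(key, i, False, chunk)):
            raise ValueError(f"segment {i} failed authentication")
        yield chunk
    if not saw_last:
        raise ValueError("stream truncated")


key = bytes(32)  # demo key only
segments = protect(key, b"echo hello; echo world; echo done\n")
segments[1] = b"X" + segments[1][1:]  # attacker corrupts the second segment
released = []
try:
    for chunk in verified_stream(key, segments):
        released.append(chunk)  # a real pipeline would write to stdout here
except ValueError:
    pass  # decryption aborts, but earlier chunks were already released
```

The consumer never sees the tampered chunk, but it has already acted on everything before it, which is the truncation mischief mentioned above.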

What is authenticated encryption?

Age supports a small number of algorithms. You can encrypt with a password using scrypt. In this case a symmetric key is derived from the password and you get symmetric authenticated encryption as the security goal. This doesn’t just mean that the ciphertext is protected from tampering, it also means that the encrypted file must have come from somebody who knows the password. Assuming you chose a strong password and only shared it with people you trust, then successful authenticated decryption provides strong evidence that the file came from a trusted source.

In terms of threat models, you could say that authenticated encryption is intended to protect against spoofing threats as well as tampering threats (the S and T in STRIDE, at a very basic level). Unfortunately, the age spec doesn’t document its threat model or the security goals it is intended to achieve, so I’m having to read between the lines to work out what was intended.

For public key cryptography, the notion of authenticated encryption becomes more complicated. I wrote a three-part blog post about it. There are public key authenticated encryption modes, such as NaCl’s box, but age doesn’t use one of those and instead opts for unauthenticated ECIES encryption (like JOSE’s ECDH-ES algorithm) using X25519. So while age uses symmetric authenticated encryption for the file contents, the symmetric file key is itself encrypted using an unauthenticated mode. This means an attacker cannot tamper with an existing ciphertext, but they can completely replace it with one of their own choosing, because anyone who knows the recipient’s public key can produce a valid encrypted file. In other words, age is secure against chosen ciphertext attacks from a confidentiality point of view, but it doesn’t provide any origin authentication. You can’t establish that the encrypted file came from a trusted party, so it is completely insecure to pipe the output of age into another tool without independently establishing the authenticity of the file.
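Here is a toy sketch of why ephemeral-static key agreement gives no origin authentication. To keep it self-contained I’ve swapped in Diffie-Hellman over a small prime field for X25519, and an XOR pad plus HMAC for the real key-wrapping AEAD, so it is not secure and it is not age’s code; the names wrap and unwrap are my own. The point is only that wrap needs nothing but the recipient’s public key.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1  # toy prime modulus: far too small to be secure
G = 3


def keypair():
    secret = secrets.randbelow(P - 2) + 1
    return secret, pow(G, secret, P)


def wrap_keys(shared: int):
    # Derive a 16-byte pad and a 16-byte MAC key from the shared secret.
    material = hashlib.sha256(shared.to_bytes(16, "big")).digest()
    return material[:16], material[16:]


def wrap(recipient_public: int, file_key: bytes):
    """Ephemeral-static wrap of a 16-byte file key.

    Note the inputs: only the recipient's PUBLIC key is required.
    """
    eph_secret, eph_public = keypair()
    pad, mac_key = wrap_keys(pow(recipient_public, eph_secret, P))
    body = bytes(a ^ b for a, b in zip(file_key, pad))
    tag = hmac.new(mac_key, body, hashlib.sha256).digest()
    return eph_public, body, tag


def unwrap(recipient_secret: int, stanza) -> bytes:
    eph_public, body, tag = stanza
    pad, mac_key = wrap_keys(pow(eph_public, recipient_secret, P))
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrapped key was modified")
    return bytes(a ^ b for a, b in zip(body, pad))


bob_secret, bob_public = keypair()

# Mallory has never shared a secret with Bob, yet her stanza unwraps
# cleanly: the tag proves integrity, not who created the file.
mallory_file_key = secrets.token_bytes(16)
forged = wrap(bob_public, mallory_file_key)
accepted = unwrap(bob_secret, forged)
```

Tampering with an existing stanza is rejected, but a freshly made one from any stranger is indistinguishable from one made by a trusted sender.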

The age spec explicitly lists signing and support for web of trust as out of scope. Instead it suggests using separate tools such as signify/minisign if you want to be sure of where a file came from. After all, this is the Unix philosophy: small composable tools that do just one thing. But if you click through the NaCl link above, djb links to a definition of the public key authenticated encryption security goal that he uses for box. This paper (which is old enough to drink in the UK) makes it clear that achieving authenticated encryption by generic composition of signatures and public key encryption is surprisingly difficult; more so than in the symmetric setting, as there is no counterpart to Encrypt-then-MAC that is secure in general. For symmetric crypto we long ago gave up pretending that generic composition was something end users can get right on their own and opted for dedicated AE modes. For some reason, djb seems to be the only person to realise the same applies to public key cryptography.

A second problem with requiring a generic composition of signing and encryption is that it totally kills the streaming use-cases. Either I have to verify a signature over the entire encrypted file before decrypting it, or I have to decrypt to a temporary file that I then verify, or I need to define my own chunked streaming signature verification tool and combine that with age (and hope the chunk sizes line up). Users won’t get this right, so we’ll be right back at streaming unverified plaintext into shell commands.

So how could age fix this? Most importantly, the spec should define its security goals. I believe the correct security goal is authenticated encryption for both symmetric and public key use cases. The simplest way to achieve this would be to swap out the ephemeral X25519 code in favour of encrypting the file key with NaCl’s crypto_box. The sender must supply their own private key during encryption (which could be read out of ~/.age automatically). The recipient either supplies the expected sender’s public key on the command line, or else has a file of trusted senders in their age config directory – perhaps populated by TOFU if you don’t want to get into something like web of trust, but this problem is going to have to be solved somehow.
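As a rough illustration of the difference this makes, here is a toy static-static construction in the spirit of crypto_box. It is not NaCl’s API or algorithm: Diffie-Hellman over a small prime field stands in for X25519, an XOR pad plus HMAC stands in for the real AEAD, and seal_box and open_box are names I made up. The wrapping key now depends on the sender’s private key, so the recipient has to say whose public key they expect, and a file from anyone else fails to open.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1  # toy prime modulus: far too small to be secure
G = 3


def keypair():
    secret = secrets.randbelow(P - 2) + 1
    return secret, pow(G, secret, P)


def _derive(shared: int, nonce: bytes):
    # A fresh random nonce per message keeps the pad from repeating even
    # though the static-static shared secret is always the same.
    material = hashlib.sha256(shared.to_bytes(16, "big") + nonce).digest()
    return material[:16], material[16:]


def seal_box(sender_secret: int, recipient_public: int, file_key: bytes):
    """Wrap a 16-byte file key using the sender's long-term private key."""
    nonce = secrets.token_bytes(24)
    pad, mac_key = _derive(pow(recipient_public, sender_secret, P), nonce)
    body = bytes(a ^ b for a, b in zip(file_key, pad))
    tag = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    return nonce, body, tag


def open_box(recipient_secret: int, expected_sender_public: int, sealed):
    """Unwrap, succeeding only if the expected sender produced the box."""
    nonce, body, tag = sealed
    shared = pow(expected_sender_public, recipient_secret, P)
    pad, mac_key = _derive(shared, nonce)
    expected = hmac.new(mac_key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("not from the expected sender, or modified")
    return bytes(a ^ b for a, b in zip(body, pad))


alice_secret, alice_public = keypair()
bob_secret, bob_public = keypair()

file_key = secrets.token_bytes(16)
sealed = seal_box(alice_secret, bob_public, file_key)
opened = open_box(bob_secret, alice_public, sealed)  # Bob trusts Alice's key
```

The expected_sender_public argument is where a command-line flag or a file of trusted senders would plug in.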

Unfortunately, as I pointed out on HN, you can’t simply use a static key pair with age to achieve authenticated encryption, as age’s key-wrapping algorithm is completely insecure when used in this way (it uses a fixed nonce but isn’t resistant to nonce reuse). My overall impression of age is that it uses good algorithms in non-standard ways and then justifies this with ad-hoc reasoning about why it’s safe in this specific implementation. I’d be much happier if it used existing mechanisms from libsodium, which appear to be sufficient to cover all its use-cases.
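To see why a fixed nonce is only safe when the wrapping key is fresh every time, consider this small sketch. It assumes that a static sender key would make the wrapping key repeat across files to the same recipient, and it uses SHA-256 in counter mode as a stand-in for the real stream cipher; none of this is age’s code.

```python
import hashlib
import secrets

FIXED_NONCE = bytes(12)  # the same nonce for every wrap


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy stream cipher: SHA-256 in counter mode (a stand-in, not ChaCha20).
    out = b""
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(key + nonce + counter.to_bytes(4, "big"))
        out += block.digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def wrap(wrap_key: bytes, file_key: bytes) -> bytes:
    return xor(file_key, keystream(wrap_key, FIXED_NONCE, len(file_key)))


# With static keys on both sides the wrapping key never changes.
static_wrap_key = hashlib.sha256(b"same sender, same recipient").digest()

file_key_1 = secrets.token_bytes(16)
file_key_2 = secrets.token_bytes(16)
wrapped_1 = wrap(static_wrap_key, file_key_1)
wrapped_2 = wrap(static_wrap_key, file_key_2)

# The keystream cancels out: the two wrapped keys leak the XOR of the file
# keys, so learning one file key reveals the other without the wrap key.
recovered = xor(xor(wrapped_1, wrapped_2), file_key_1)
```

With ChaCha20-Poly1305 the situation is worse than this toy suggests, because the one-time Poly1305 key is derived from the same key and nonce and would repeat as well, which opens the door to forgeries.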

Cryptographic Doom

The age spec defines a header that lists ways to derive the file decryption key for each recipient. Here are some examples from the spec:

-> X25519 8hWaIUmk67IuRZ41zMk2V9f/w3f5qUnXLL7MGPA+zE8
tXgpAxKgqyu1jl9I/ATwFgV42ZbNgeAlvCTJ0WgvfEo
-> scrypt GixTkc7+InSPLzPNGU6cFw 18
kC4zjzi7LRutdBfOlGHCgox8SXgfYxRYhWM1qPs0ca8
-> ssh-rsa SkdmSg
SW+xNSybDWTCkWx20FnCcxlfGC889s2hRxT8+giPH2DQMMFV6DyZpveqXtNwI3ts
5rVkW/7hCBSqEPQwabC6O5ls75uNjeSURwHAaIwtQ6riL9arjVpHMl8O7GWSRnx3
NltQt08ZpBAUkBqq5JKAr20t46ZinEIsD1LsDa2EnJrn0t8Truo2beGwZGkwkE2Y
j8mC2GaqR0gUcpGwIk6QZMxOdxNSOO7jhIC32nt1w2Ep1ftk9wV1sFyQo+YYrzOx
yCDdUwQAu9oM3Ez6AWkmFyG6AvKIny8I4xgJcBt1DEYZcD5PIAt51nRJQcs2/ANP
+Y1rKeTsskMHnlRpOnMlXqoeN6A3xS+EWxFTyg1GREQeaVztuhaL6DVBB22sLskw
XBHq/XlkLWkqoLrQtNOPvLoDO80TKUORVsP1y7OyUPHqUumxj9Mn/QtsZjNCPyKN
ds7P2OLD/Jxq1o1ckzG3uzv8Vb6sqYUPmRvlXyD7/s/FURA1GetBiQEdRM34xbrB

In each case we have an algorithm identifier followed by algorithm-specific parameters. For example, in the X25519 case we have an ephemeral public key and then an encrypted file key.
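As a sketch of that structure, here is a small parser for the stanza layout. It assumes each stanza begins with a line starting with "-> " that carries the identifier and its arguments, and that the wrapped key follows on the next line or lines; that layout is my reading of the examples rather than a complete implementation of the spec, and parse_stanzas is a hypothetical name.

```python
def parse_stanzas(header: str) -> list:
    """Split a header into stanzas: identifier, arguments, body lines."""
    stanzas = []
    for line in header.splitlines():
        if line.startswith("-> "):
            # First token is the identifier; the rest are its parameters.
            key_type, *args = line[3:].split()
            stanzas.append({"type": key_type, "args": args, "body": []})
        elif stanzas and line:
            # Any other non-empty line belongs to the most recent stanza.
            stanzas[-1]["body"].append(line)
    return stanzas


EXAMPLE = """\
-> X25519 8hWaIUmk67IuRZ41zMk2V9f/w3f5qUnXLL7MGPA+zE8
tXgpAxKgqyu1jl9I/ATwFgV42ZbNgeAlvCTJ0WgvfEo
-> scrypt GixTkc7+InSPLzPNGU6cFw 18
kC4zjzi7LRutdBfOlGHCgox8SXgfYxRYhWM1qPs0ca8
"""

for stanza in parse_stanzas(EXAMPLE):
    print(stanza["type"], stanza["args"], len(stanza["body"]))
```

For the X25519 stanza the single argument is the ephemeral public key and the body is the encrypted file key; for scrypt the arguments are the salt and the work factor.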

But apart from syntax this header is incredibly close to JOSE! You might as well write it as follows:

{ "alg": "X25519", "epk": { ... } }

JOSE even supports multiple recipients (in the less commonly used JSON Serialization format) and ECIES with key wrapping. But the cryptographic community has rightly beaten up JOSE for requiring this algorithm header, which led to catastrophic attacks.

Edit: Filippo Valsorda (the author of age) has pointed out that age only uses the algorithm identifier (key type) to match the recipient’s key, not to determine which algorithm to use. Age keys are uniquely linked to an algorithm in exactly the manner I suggest.

I’m fairly sure that age is not vulnerable to the same kind of attacks, but I’m not convinced it never will be. Even if there is no immediate attack, it still violates Moxie’s Cryptographic Doom Principle. Although the age header is protected by a MAC, it cannot verify that MAC until it decrypts the file key. In order to decrypt the file key it trusts the header to tell it what algorithm to use.

Why does this mistake keep being made? As I’ve written before, there are better ways to handle this that systematically avoid these issues.
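One way to avoid the problem is to bind every key to exactly one algorithm, so the label in the header can only ever be used to skip stanzas meant for other keys. The sketch below shows the shape of that idea; Stanza, Identity and recover_file_key are hypothetical names of mine, and the unwrap function is a placeholder for whatever real key-unwrapping routine the key type implies.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Stanza:
    key_type: str  # label read from the (not yet authenticated) header
    args: tuple
    body: bytes


@dataclass(frozen=True)
class Identity:
    key_type: str                      # fixed when the key is created
    unwrap: Callable[[Stanza], bytes]  # the only algorithm this key runs


def recover_file_key(identity: Identity, stanzas) -> Optional[bytes]:
    for stanza in stanzas:
        if stanza.key_type != identity.key_type:
            continue  # the label only filters; it never picks an algorithm
        try:
            return identity.unwrap(stanza)
        except ValueError:
            continue  # right type, but not addressed to this key
    return None


def demo_unwrap(stanza: Stanza) -> bytes:
    # Placeholder for a real unwrap: accepts one fixed body, rejects others.
    if stanza.body != b"for-me":
        raise ValueError("not my stanza")
    return b"file-key"


me = Identity("X25519", demo_unwrap)
header = [
    Stanza("scrypt", (), b"for-me"),       # skipped: wrong key type
    Stanza("X25519", (), b"for-someone"),  # tried, fails, skipped
    Stanza("X25519", (), b"for-me"),       # tried, succeeds
]
file_key = recover_file_key(me, header)
```

An attacker who rewrites the label can make a stanza be ignored, but cannot steer the key into running a different algorithm.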

In summary, I think age is interesting and solves a genuine problem. But I think the design could still be improved from where it is today to provide clearer security goals and avoid potential pitfalls in the future.