In my last post I was clutching my pearls over a 13-year-old MSDN article that had the gall to be written by somebody not familiar with the proper usage of cryptographic initialization vectors. Some of my fans were nice enough to point out that in all of my histrionic ranting and raving I forgot to say anything useful, like how you’re actually supposed to encrypt a file.

So without further ado, here is how you encrypt and decrypt a file:

$ gpg -c --cipher-algo AES128 [file] # Encrypt

$ gpg -d [file].gpg # Decrypt

Anticlimactic, right? It doesn’t even mention IVs, but that’s kind of the point. A tutorial about file encryption that includes manual handling of IVs is kind of like a tutorial about setting up a web server that includes reimplementing TCP. There’s literally no benefit to doing it and you’re almost guaranteed to do it wrong. It’s the difference between encrypting a file and implementing file encryption. The first is a common task for a programmer and the second shouldn’t be attempted by a non-crypto-expert. Microsoft should know better than to conflate the two, especially in a tutorial intended for a lay audience.

GPG?

This post isn’t supposed to be a tutorial about GPG, but if anybody actually wants to take my advice, please check out the documentation. And if you’re looking to do file encryption in [target programming language], just do the equivalent operations with an OpenPGP implementation in that language.
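If your target language happens to be Python and you don't feel like picking an OpenPGP library, one low-tech option is to drive the gpg binary itself. This is only a sketch under a few assumptions: gpg (GnuPG 2.1 or later) is on your PATH, and the function names gpg_symmetric_argv and encrypt_file are made up for this example.

```python
import subprocess


def gpg_symmetric_argv(src, dst, cipher="AES128"):
    """Build the argument list for the equivalent of `gpg -c --cipher-algo AES128 src`."""
    return [
        "gpg", "--batch", "--yes",
        "--pinentry-mode", "loopback",  # let gpg take the passphrase without a prompt
        "--passphrase-fd", "0",         # read the passphrase from stdin
        "--symmetric", "--cipher-algo", cipher,
        "--output", dst, src,
    ]


def encrypt_file(src, dst, passphrase):
    """Encrypt src into dst; raises CalledProcessError if gpg exits non-zero."""
    subprocess.run(
        gpg_symmetric_argv(src, dst),
        input=passphrase.encode() + b"\n",
        check=True,
    )
```

Feeding the passphrase over stdin keeps it out of the process list, which is about the only design decision you have to make here. Everything else is gpg's problem, which is the whole point.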

Wow, So PGP Is Infallible?

Oh god no. It has many flaws, the biggest of which is probably the awful key distribution. It just happens to have a decent implementation of file encryption that’s available on just about every platform and in just about every programming language. If you want to be thoroughly confused about this topic, here are Matthew Green and Filippo Valsorda shitting all over PGP.

Hey, I Came Here For the Crypto Details!

Alright so now that we’re all clear that when you want to encrypt a file you should just use a mainstream, high-level crypto library and not touch the primitives yourself, here is a basic way to implement file encryption that isn’t completely terrible:

1. Select a secure, modern block cipher. I’d stick with AES128 unless you’ve got a really compelling reason to use something else. You can use AES256 if you want but realistically it’s not going to increase security. Your password will have nowhere near 128 bits of entropy so it’s going to be by far the weakest link in this chain.
2. Obtain a cryptographically secure random value to use as an initialization vector. (hint: use /dev/urandom)
3. Obtain a high entropy key. In this case I’d say select a strong password (another topic entirely), then run it through bcrypt for key stretching. For simplicity you could use the IV as the salt.
4. Using your block cipher, the key created from sending your password through bcrypt, and the IV generated from /dev/urandom, perform cipher block chaining to encrypt your plaintext.
5. Prepend the IV to the ciphertext, then run that whole thing through HMAC, preferably using SHA2 or SHA3. Note that we’re authenticating after encrypting, which keeps us clear of the doom principle.
6. Append the output of the HMAC to the IV and ciphertext, and output the result. So your output should be IV || CT || HMAC(IV || CT)
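Here is a rough Python sketch of the plumbing around those steps, and it takes some liberties. Python's standard library has neither AES nor bcrypt, so the CBC step is passed in as a function (cbc_encrypt is a hypothetical parameter you'd back with a real crypto library), and PBKDF2-HMAC-SHA256 stands in for bcrypt. Deriving a separate HMAC key, the 600,000 iteration count, and all of the names are my assumptions, not something the steps above dictate.

```python
import hashlib
import hmac
import os

IV_LEN = 16   # AES block size in bytes
ENC_LEN = 16  # AES-128 key size in bytes
MAC_LEN = 32  # HMAC-SHA256 key size in bytes


def derive_keys(password, salt, iterations=600_000):
    """Stretch a password (bytes) into an encryption key and a MAC key.

    PBKDF2 is a stand-in for bcrypt, chosen only because it ships with Python.
    """
    material = hashlib.pbkdf2_hmac(
        "sha256", password, salt, iterations, dklen=ENC_LEN + MAC_LEN
    )
    return material[:ENC_LEN], material[ENC_LEN:]


def seal(password, plaintext, cbc_encrypt, iterations=600_000):
    """Return IV || CT || HMAC(IV || CT).

    cbc_encrypt(key, iv, plaintext) -> ciphertext must be supplied by the
    caller; it is expected to do AES-128-CBC including padding.
    """
    iv = os.urandom(IV_LEN)                                   # step 2
    enc_key, mac_key = derive_keys(password, iv, iterations)  # step 3, IV as salt
    ct = cbc_encrypt(enc_key, iv, plaintext)                  # step 4
    tag = hmac.new(mac_key, iv + ct, hashlib.sha256).digest() # step 5
    return iv + ct + tag                                      # step 6
```

Passing the cipher in as a function is clunky, but it keeps the sketch honest about which parts are just bookkeeping and which part is the actual block cipher you shouldn't be writing yourself.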

There you go, implementing file encryption in 6 easy steps. If you need a library that contains the primitives mentioned here, OpenSSL is a good option. To decrypt, first verify the HMAC by recomputing it over IV || CT and comparing it with the given HMAC, then just undo the CBC encryption.
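The verify-before-decrypt half can be sketched in Python like this. It assumes HMAC-SHA256 (so a 32 byte tag), a 16 byte IV, and that you already have the MAC key in hand; verify_and_split is a made-up name. The comparison uses hmac.compare_digest so it runs in constant time and doesn't leak how many leading bytes of a forged tag were right.

```python
import hashlib
import hmac

IV_LEN = 16   # AES block size in bytes
TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def verify_and_split(blob, mac_key):
    """Check the tag on IV || CT || HMAC(IV || CT) and return (iv, ct).

    Raises ValueError before any decryption is attempted if the tag is wrong.
    """
    if len(blob) < IV_LEN + TAG_LEN:
        raise ValueError("input too short to contain an IV and a tag")
    body, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("HMAC mismatch: wrong key or modified data")
    return body[:IV_LEN], body[IV_LEN:]
```

Only once this returns do you hand the IV and ciphertext to your CBC decryption routine.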

That’s Cool…But Why Did We Do All That Stuff?

The randomized IV and cipher block chaining achieve semantic security so that we don’t end up leaking information. The HMAC achieves data integrity and authentication. Key stretching with bcrypt makes the key generated from your password slightly less terrible.
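To put rough numbers on "slightly less terrible", here is a back-of-the-envelope Python sketch. It assumes the password is chosen uniformly at random (human-chosen passwords do worse), and it treats a key stretching cost of c, meaning about 2^c iterations, as adding roughly c bits of work for an attacker. Both function names are invented for this example.

```python
import math


def password_entropy_bits(alphabet_size, length):
    """Entropy of a password picked uniformly at random from the alphabet."""
    return length * math.log2(alphabet_size)


def effective_bits(entropy_bits, cost):
    """Rough attacker work, in bits, after 2**cost rounds of key stretching."""
    return entropy_bits + cost


# 10 random characters drawn from a-z, A-Z, 0-9
bits = password_entropy_bits(62, 10)
print(f"{bits:.1f}")                      # 59.5
print(f"{effective_bits(bits, 12):.1f}")  # 71.5
```

Even a fairly good random password plus stretching lands well short of 128 bits, which is why AES256 buys you nothing here.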

If you want to learn by playing around with these things programmatically (and you can tolerate a webapp with a UI designed by hackers), check out https://id0-rsa.pub/.