Crypto-misnomers: "Zero Knowledge" Considered Self-Descriptive

The past few years have brought more products and services that offer cryptography features as a selling point. More security is good, but a lot of products and services seem to ship before a cryptography expert has a chance to correct the marketing copy. Even worse: Often, these misconceptions corrupt the design and implementation of the feature itself.

Cryptography is a difficult subject for non-experts to understand, as is. Muddying the waters further is bad for everyone. If you're looking for a primer on basic cryptography terms and concepts, see the linked blog post instead.

This list is not exhaustive. This blog post will be updated and the list will be expanded as time goes on.

Misused Terminology

Zero Knowledge

Terminology misuse example: SpiderOak's page on Zero Knowledge in their marketing copy.

In layman's terms, zero knowledge is a property of authentication protocols, not encryption.

Formally, "zero knowledge" applies to mathematical proofs: You are trying to prove that you know some secret, without revealing the secret to the other person. So you both participate in a protocol, wherein the most the other person learns is:

Do they have the secret? [Yes/No]

In the practical side of cryptography, TRUE/FALSE is most often the result of an authentication decision. Was this message unmodified in transit? Was this message signed by the private key that corresponds to this public key? A good example of a zero-knowledge proof is the socialist millionaire protocol (as used in Off-The-Record messaging). An upcoming cryptocurrency, Zcash, is built on zero-knowledge proofs to facilitate digital payments that are more private than bitcoin.
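To make the "prove you know a secret, reveal only yes/no" idea concrete, here is a minimal sketch of the Schnorr identification protocol, a textbook proof of knowledge of a discrete logarithm that is zero-knowledge against an honest verifier. It is not the socialist millionaire protocol mentioned above, and the group parameters (`P`, `Q`, `G`) are toy values chosen for readability, far too small for real use. All function names are made up for this illustration.

```python
import secrets

# Toy parameters (an assumption for illustration only): P = 2*Q + 1 with
# both prime, and G generates the subgroup of prime order Q.
P, Q, G = 2039, 1019, 4

def keygen():
    """Prover's long-term secret x and the matching public value y = G^x."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def commit():
    """Prover picks a fresh random r and sends the commitment t = G^r."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def challenge():
    """Verifier replies with a random challenge c."""
    return secrets.randbelow(Q)

def respond(x, r, c):
    """Prover answers with s = r + c*x mod Q; the random r masks x."""
    return (r + c * x) % Q

def verify(y, t, c, s):
    """Verifier checks G^s == t * y^c; the outcome is a bare True/False."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P

def run_protocol(x, y):
    """One honest round between a prover holding x and a verifier holding y."""
    r, t = commit()
    c = challenge()
    return verify(y, t, c, respond(x, r, c))
```

The verifier only ever sees `t`, `c`, and `s`. Because `r` is fresh and uniformly random for every run, those values could have been simulated without knowing `x`, which is the intuition behind calling the exchange zero-knowledge.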

Looking at the SpiderOak example: Instead of misusing "zero knowledge", cryptographers refer to the property they intended to describe as confidential. Often, such systems will be described as privacy-preserving and may offer end-to-end authenticated encryption between peers, wherein the server only sees ciphertext. These are all generally accepted ways to describe the same set of solutions to the same set of problems. (End-to-end is even a little on the buzz-wordy side, if your marketing person happens to like that sort of thing.)

Using "zero knowledge" to describe encryption is actively harmful. Even if the server only learns "the value of the ciphertext when the plaintext is encrypted under an unknown random 256-bit key", it has still gained a nonzero amount of knowledge. An observer will always learn, at minimum, the approximate length of the encrypted message from the ciphertext produced. Given that message length was recently used in the CRIME, BREACH, and HEIST attacks, it's safe to say that leaking the length of the plaintext to an attacker isn't benign.
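A small sketch of the length leak, under generous assumptions: the "cipher" here is an idealized XOR with a fresh random pad (so the ciphertext bytes themselves reveal nothing), and `pad_to` is a hypothetical fixed-size framing invented for this example. Even so, an observer who never sees the key can tell a "yes" from a "no" purely from the ciphertext length.

```python
import secrets

def ideal_encrypt(plaintext: bytes) -> bytes:
    """Idealized encryption: XOR with a fresh, never-reused random pad."""
    pad = secrets.token_bytes(len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, pad))

def guess_from_length(ciphertext: bytes) -> str:
    """The observer's 'attack': no key needed, just count the bytes."""
    return "yes" if len(ciphertext) == 3 else "no"

def pad_to(plaintext: bytes, size: int) -> bytes:
    """Hypothetical framing: 2-byte length prefix, then zero fill to `size`."""
    if len(plaintext) > size - 2:
        raise ValueError("message too long for this frame size")
    fill = bytes(size - 2 - len(plaintext))
    return len(plaintext).to_bytes(2, "big") + plaintext + fill

# Without padding, the answer leaks through the length alone.
leaked = [guess_from_length(ideal_encrypt(m)) for m in (b"yes", b"no")]

# With fixed-size framing applied before encryption, both look the same size.
sizes = {len(ideal_encrypt(pad_to(m, 16))) for m in (b"yes", b"no")}
```

Padding to a fixed size hides the difference in this toy case, at the cost of bandwidth. It does not make the system "zero knowledge"; it only shrinks what the ciphertext length gives away.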

Bad Ideas

One-Time Pads

Instead of a one-time pad, consider utilizing authenticated stream ciphers.

This has been adequately covered (with an example) by a blog post titled, be wary of one-time pads and other crypto unicorns by Joseph Bonneau. To recap the arguments made:

Using one-time pads requires generating a lot of true random data and most mobile devices can't do that.

We don't have secure high-bandwidth channels for sharing one-time pads which don't rely on symmetric cryptography.

One-time pads don't assure integrity.
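The integrity point is easy to demonstrate. In this sketch (the message format and helper names are invented for illustration), an attacker who can guess the plaintext layout rewrites the amount in a one-time-pad ciphertext without ever learning the pad.

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def forge(ciphertext: bytes, known: bytes, wanted: bytes) -> bytes:
    """Turn an encryption of `known` into an encryption of `wanted`.

    Only the ciphertext and a guess at the plaintext are needed.
    """
    return xor_bytes(ciphertext, xor_bytes(known, wanted))

message = b"PAY ALICE $0100"
pad = secrets.token_bytes(len(message))   # perfectly random, used once
ciphertext = xor_bytes(message, pad)      # textbook one-time pad

# The attacker never sees `pad`, yet controls what the receiver decrypts.
forged = forge(ciphertext, b"PAY ALICE $0100", b"PAY ALICE $9900")
received = xor_bytes(forged, pad)
```

The receiver decrypts `forged` to `b"PAY ALICE $9900"` and has no way to notice the change. Perfect secrecy says nothing about whether the message arrived unmodified.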

Furthermore, even if you cast caution to the wind and attempt to implement an OTP using a pre-shared key (e.g. a USB drive full of random bytes from /dev/urandom), your operating system's CSPRNG is actually a stream cipher; at this point, why not just use a stream cipher?

For example, Salsa20/20 (the predecessor to ChaCha20, which is used by /dev/urandom in the OpenBSD kernel and coming to the Linux kernel in 4.8) offers $2^{64}$ independent keystreams for any given key, each selected by a 64-bit nonce. Each keystream can be used for $2^{70}$ bytes ($2^{64}$ blocks of 64 bytes) before you need to move on to the next nonce.
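For readers who want to check those figures, the arithmetic is short. This sketch assumes Salsa20's standard layout of 64-byte output blocks, a 64-bit block counter, and a 64-bit nonce; the constant names are just labels for this example.

```python
BLOCK_BYTES = 64     # Salsa20 emits 64 bytes of keystream per block
COUNTER_BITS = 64    # block counter width
NONCE_BITS = 64      # nonce width

blocks_per_stream = 2 ** COUNTER_BITS
bytes_per_stream = blocks_per_stream * BLOCK_BYTES   # 2^64 * 2^6 = 2^70
streams_per_key = 2 ** NONCE_BITS

# 2^70 bytes is one zebibyte of keystream per nonce.
zebibytes_per_stream = bytes_per_stream // 2 ** 70
```

Put differently, a single 32-byte key covers $2^{64}$ keystreams of one zebibyte each, which is far more than anyone could physically carry around as a one-time pad.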

Stream ciphers are similar to a one-time pad, except you typically only need to transmit/exchange 32 bytes of secret keying material.
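The following is a teaching sketch of that idea using only the Python standard library: a 32-byte key is stretched into as much keystream as needed, and a MAC over the ciphertext supplies the integrity a one-time pad lacks. The keystream here comes from SHAKE-256 as a stand-in for a real stream cipher, combined with HMAC-SHA256 in an encrypt-then-MAC layout. That construction and the function names are assumptions made for illustration. Production code should call a vetted AEAD such as ChaCha20-Poly1305 from a well-reviewed library instead.

```python
import hashlib
import hmac
import secrets

NONCE_LEN, TAG_LEN = 16, 32

def _subkeys(key: bytes):
    """Derive separate encryption and MAC keys from one 32-byte key."""
    enc_key = hmac.new(key, b"encrypt", hashlib.sha256).digest()
    mac_key = hmac.new(key, b"authenticate", hashlib.sha256).digest()
    return enc_key, mac_key

def _keystream(enc_key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream: SHAKE-256 over key || nonce (NOT a real cipher)."""
    return hashlib.shake_256(enc_key + nonce).digest(length)

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return nonce || ciphertext || tag."""
    enc_key, mac_key = _subkeys(key)
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def decrypt(key: bytes, sealed: bytes) -> bytes:
    """Verify the tag first; only then remove the keystream."""
    if len(sealed) < NONCE_LEN + TAG_LEN:
        raise ValueError("message too short")
    enc_key, mac_key = _subkeys(key)
    nonce = sealed[:NONCE_LEN]
    ciphertext = sealed[NONCE_LEN:-TAG_LEN]
    tag = sealed[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("forged or corrupted message")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))
```

A caller would generate the key once with `secrets.token_bytes(32)` and share only those 32 bytes. Flipping any bit of the output makes `decrypt` raise instead of silently returning altered plaintext.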

The weakness of a digital cryptosystem is almost certainly going to be the public key cryptography that allows a system to negotiate its secret keys, rather than the symmetric key encryption (for which you should be using a modern cipher in an AEAD mode).

Homomorphic Encryption

You probably don't want homomorphic encryption.

The past few years have seen a rise in encrypted databases, and with it, an increased interest in homomorphic encryption, which allows you to perform calculations on ciphertext without knowing the plaintext or the encryption key.

Homomorphic encryption is an area of interest to cryptography researchers and could one day pave the way to incredible innovation and practical benefits for software developers. However, products that claim to provide it today should be viewed with severe skepticism: Homomorphic encryption (HE) means sacrificing message integrity. It's a contradiction to believe that a cryptosystem can be IND-CCA2 secure while allowing the server to modify the ciphertext to perform useful calculations.
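Unpadded ("textbook") RSA makes the trade-off visible, because it is multiplicatively homomorphic. The sketch below uses the classic toy key with primes 61 and 53, which is only suitable for demonstration, and the function names are invented for this example. The same property that lets a server compute on ciphertexts lets anyone else alter them.

```python
# Toy textbook-RSA key (illustrative only; real keys are 2048+ bits and padded).
P, Q = 61, 53
N = P * Q                 # 3233
PHI = (P - 1) * (Q - 1)   # 3120
E = 17
D = 2753                  # private exponent: (E * D) % PHI == 1

def encrypt(m: int) -> int:
    """Unpadded RSA encryption: c = m^E mod N."""
    return pow(m, E, N)

def decrypt(c: int) -> int:
    """Unpadded RSA decryption: m = c^D mod N."""
    return pow(c, D, N)

def multiply_ciphertexts(c1: int, c2: int) -> int:
    """Homomorphic step: needs only the public modulus, never a key."""
    return (c1 * c2) % N

# A server multiplies two encrypted values without seeing either one.
product = decrypt(multiply_ciphertexts(encrypt(7), encrypt(6)))

# An attacker uses the same trick to double an encrypted amount.
doubled = decrypt(multiply_ciphertexts(encrypt(100), encrypt(2)))
```

Here `product` is 42 and `doubled` is 200. The decrypting party has no way to tell a sanctioned computation from tampering, which is exactly the integrity that HE gives up.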

Furthermore, the threat model for most applications doesn't lend itself to encrypted databases.

If your application needs to use the data, it must therefore be capable of decrypting the data.

Most relevant data breaches compromise the application itself (e.g. via SQL injection or Remote Code Execution).

If you're designing an application that encrypts and ships off data for cold storage, you're better off using a sealing API instead of messing around with HE.

Finally, most implementations of HE involve unpadded RSA, AES in CBC mode with a deterministic IV and other perilous cryptography constructions. If you go down this route anyway, you will (at minimum) want your code to be thoroughly audited by cryptography experts.

Final Thoughts

It's perfectly okay to not understand cryptography terms. Cryptography is a challenging subject where a lot of very technical details matter tremendously and the risk of miscommunication is great.

It's entirely another thing to design and sell cryptography products with a flawed premise, or misrepresent what your product/service offers by using an incorrect or misleading term. Be careful that you're not selling security snake-oil, intentional or not.