The last couple of months have been a bit slow in the blogging department. It’s hard to blog when there are exciting things going on. But also: I’ve been a bit blocked. I have two or three posts half-written, none of which I can quite get out the door.

Instead of writing and re-writing the same posts again, I figured I might break the impasse by changing the subject. Usually the easiest way to do this is to pick some random protocol and poke at it for a while to see what we learn.

The protocols I’m going to look at today aren’t particularly ‘random’ — they’re both popular encrypted instant messaging protocols. The first is OTR (Off the Record Messaging). The second is Cryptocat’s group chat protocol. Each of these protocols has a similar end-goal, but they get there in slightly different ways.

I want to be clear from the start that this post has absolutely no destination. If you’re looking for exciting vulnerabilities in protocols, go check out someone else’s blog. This is pure noodling.

The OTR protocol

OTR is probably the most widely-used protocol for encrypting instant messages. If you use IM clients like Adium, Pidgin or ChatSecure, you already have OTR support. You can enable it in some other clients through plugins and overlays.

OTR was originally developed by Borisov, Goldberg and Brewer and has rapidly come to dominate its niche. Mostly this is because Borisov et al. are smart researchers who know what they’re doing. Also: they picked a cool name and released working code.

OTR works within the technical and usage constraints of your typical IM system. Roughly speaking, these are:

1. Messages must be ASCII-formatted and have some (short) maximum length.
2. Users won’t bother to exchange keys, so authentication should be “lazy” (i.e., you can authenticate your partners after the fact).
3. Your chat partners are all FBI informants, so your chat transcripts must be plausibly deniable — so as to keep them from being used as evidence against you in a court of law.

Coming to this problem fresh, you might find goal (3) a bit odd. In fact, to the best of my knowledge no court in the history of law has ever used a cryptographic transcript as evidence that a conversation occurred. However it must be noted that this requirement makes the problem a bit more sexy. So let’s go with it!

“Dammit, they used a deniable key exchange protocol” said no Federal prosecutor ever.

The OTR (version 2/3) handshake is based on the SIGMA key exchange protocol. Briefly, it assumes that both parties generate long-term DSA public keys, which we’ll denote by (pubA, pubB). Next the parties interact as follows:

The OTRv2/v3 AKE. Diagram by Bonneau and Morrison, all colorful stuff added. There’s also an OTRv1 protocol that’s too horrible to talk about here.

There are four elements to this protocol:

1. Hash commitment. First, Bob commits to his share of a Diffie-Hellman key exchange (g^x) by encrypting it under a random AES key r and sending the ciphertext along with a hash of g^x over to Alice.
2. Diffie-Hellman key exchange. Next, Alice sends her half of the key exchange protocol (g^y). Bob can now ‘open’ his share to Alice by sending the AES key r that he used to encrypt it in the previous step. Alice can decrypt this value and check that it matches the hash Bob sent in the first message. Now that both sides have the shares (g^x, g^y), they each use their secrets to compute a shared secret g^{xy} and hash that value several ways to establish shared encryption keys (c’, Km2, Km’2) for subsequent messages. In addition, each party hashes g^{xy} to obtain a short “session ID”. The sole purpose of the commitment phase (step 1) is to prevent either Alice or Bob from controlling the value of the shared secret g^{xy}. Since the session ID is derived by hashing the Diffie-Hellman shared secret, it’s possible to use a relatively short session ID to authenticate the channel, since neither Alice nor Bob can force this ID to a specific value.
3. Exchange of long-term keys and signatures. So far Alice and Bob have not actually authenticated that they’re talking to each other, hence their Diffie-Hellman exchange could have been intercepted by a man-in-the-middle attacker. Using the encrypted channel they’ve previously established, they now set about to fix this. Alice and Bob each send their long-term DSA public key (pubA, pubB) and key identifiers, as well as a signature on (a MAC of) the specific elements of the Diffie-Hellman message (g^x, g^y) and their view of which party they’re communicating with. They can each verify these signatures and abort the connection if something’s amiss.**
4. Revealing MAC keys. After sending a MAC, each party waits for an authenticated response from its partner. It then reveals the MAC keys for the previous messages.
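The commit-and-open dance in steps 1 and 2 is easy to sketch. The Python below is purely illustrative: it uses a toy Diffie-Hellman group and a hash-derived keystream as a stand-in for AES, not the real OTR parameters.

```python
import hashlib
import os
import secrets

# Toy Diffie-Hellman group: a Mersenne prime keeps the sketch readable.
# The real protocol uses a standardized 1536-bit group -- this is NOT secure.
P = 2**127 - 1
G = 5

def stream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in for AES-CTR: XOR against a SHA-256-derived keystream."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, 'big')).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, out))

# Step 1: Bob commits to g^x without revealing it.
x = secrets.randbelow(P - 2) + 1
gx = pow(G, x, P)
gx_bytes = gx.to_bytes(16, 'big')
r = os.urandom(16)                                  # random encryption key
commitment = stream_xor(r, gx_bytes)                # Enc_r(g^x)
gx_hash = hashlib.sha256(gx_bytes).digest()         # H(g^x)
# Bob -> Alice: (commitment, gx_hash)

# Step 2: Alice sends g^y; Bob opens his commitment by revealing r.
y = secrets.randbelow(P - 2) + 1
gy = pow(G, y, P)

opened = stream_xor(r, commitment)                  # Alice decrypts...
assert hashlib.sha256(opened).digest() == gx_hash   # ...and checks the hash

# Both sides derive the shared secret and hash it into a short session ID.
# Because Bob committed to g^x before seeing g^y, neither party could
# steer this value.
shared = pow(gy, x, P)
session_id = hashlib.sha256(shared.to_bytes(16, 'big')).hexdigest()[:8]
```

Note how the commitment forces Bob to pick his share before he ever sees Alice’s; that ordering is what makes the short session ID trustworthy.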
Lazy authentication. Of course if Alice and Bob never exchange public keys, this whole protocol execution is still vulnerable to a man-in-the-middle (MITM) attack. To verify that nothing’s amiss, both Alice and Bob should eventually authenticate each other. OTR provides three mechanisms for doing this: parties may exchange fingerprints (essentially hashes) of (pubA, pubB) via a second channel. Alternatively, they can exchange the “session ID” calculated in the second phase of the protocol. A final approach is to use the Socialist Millionaires’ Problem to prove that both parties share the same secret.
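The fingerprint mechanism is just a hash of the long-term public key rendered in a form humans can compare out-of-band. A sketch (OTR computes the fingerprint over the serialized DSA public key; the exact serialization and display grouping below are illustrative):

```python
import hashlib

def fingerprint(pubkey_bytes: bytes) -> str:
    """Render a public-key fingerprint for out-of-band comparison.
    The grouping into five blocks of eight hex digits is illustrative."""
    digest = hashlib.sha1(pubkey_bytes).hexdigest().upper()
    return ' '.join(digest[i:i + 8] for i in range(0, 40, 8))

# Both parties display this string and compare it over a second channel.
print(fingerprint(b'example serialized DSA public key'))
```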

The OTR key exchange provides the following properties:

Protecting user identities. No user-identifying information (e.g., long-term public keys) is sent until the parties have first established a secure channel using Diffie-Hellman. The upshot is that a purely passive attacker doesn’t learn the identity of the communicating partners — beyond what’s revealed by the higher-level IM transport protocol.*

Unfortunately this protection fails against an active attacker, who can easily smash an existing OTR connection to force a new key agreement and run an MITM on the Diffie-Hellman used during the next key agreement. This does not allow the attacker to intercept actual message content — she’ll get caught when the signatures don’t check out — but she can view the public keys being exchanged. From the client point of view the likely symptoms are a mysterious OTR error, followed immediately by a successful handshake.

One consequence of this is that an attacker could conceivably determine which of several clients you’re using to initiate a connection.

Weak deniability. The main goal of the OTR designers is plausible deniability. Roughly, this means that when you and I communicate there should be no binding evidence that we really had the conversation. This rules out obvious solutions like GPG-based chats, where individual messages would be digitally signed, making them non-repudiable.

Properly defining deniability is a bit complex. The standard approach is to show the existence of an efficient ‘simulator’ — in plain English, an algorithm for making fake transcripts. The theory is simple: if it’s trivial to make fake transcripts, then a transcript can hardly be viewed as evidence that a conversation really occurred.

OTR’s handshake doesn’t quite achieve ‘strong’ deniability — meaning that anyone can fake a transcript between any two parties — mainly because it uses signatures. As signatures are non-repudiable, there’s no way to fake one without access to the signer’s private key. A genuine signature from you thus reveals that we did, in fact, communicate at some point. Moreover, it’s possible to create an evidence trail that I communicated with you, e.g., by encoding my identity into my Diffie-Hellman share (g^x). At the very least I can show that at some point you were online and we did have contact.

But proving contact is not the same thing as proving that a specific conversation occurred. And this is what OTR works to prevent. The guarantee OTR provides is that if the target was online at some point and you could contact them, there’s an algorithm that can fake just about any conversation with that individual. Since OTR clients are, by design, willing to initiate a key exchange with just about anyone, merely putting your client online makes it easy for people to fake such transcripts.***

Towards strong deniability. The ‘weak’ deniability of OTR requires at least the tacit participation of the user (Bob) for whom we’re faking the transcript. This isn’t a bad property, but in practice it means that fake transcripts can only be produced by Bob himself, or by someone interacting online with Bob. This certainly cuts down on your degree of deniability.

A related concept is ‘strong deniability’, which ensures that any party can fake a transcript using only public information (e.g., your public keys).

OTR doesn’t try to achieve strong deniability — but it does try for something in between. The OTR version of deniability holds that an attacker who obtains the network traffic of a real conversation — even if they aren’t one of the participants — should be able to alter the conversation to say anything he wants. Sort of.

The rough outline of the OTR deniability process is to generate a new message authentication key for each message (using Diffie-Hellman) and then reveal those keys once they’ve been used up. In theory, a third party can obtain this transcript and — if they know the original message content — they can ‘maul’ the AES-CTR encrypted messages into messages of their choice, then they can forge their own MACs on the new messages.
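The maul-and-re-MAC trick is worth seeing concretely. The sketch below substitutes a hash-derived keystream for AES-CTR (every stream cipher is malleable in exactly this way) and assumes the forger knows the original plaintext and the revealed MAC key, but not the encryption key:

```python
import hashlib
import hmac
import os

def keystream(key: bytes, n: int) -> bytes:
    """Stand-in for AES-CTR keystream generation (illustrative only)."""
    out = bytearray()
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, 'big')).digest()
        ctr += 1
    return bytes(out[:n])

enc_key = os.urandom(16)          # never leaves the two participants
mac_key = os.urandom(16)          # revealed by the clients after use

original = b"would you like a Pizza"
ciphertext = bytes(p ^ k for p, k in zip(original, keystream(enc_key, len(original))))
tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()

# The forger mauls the ciphertext: XOR out the old plaintext, XOR in the
# new one. This only works for a same-length substitution.
forged_pt = b"would you like a Salad"
assert len(forged_pt) == len(original)
forged_ct = bytes(c ^ o ^ n for c, o, n in zip(ciphertext, original, forged_pt))

# With the revealed MAC key, the forger re-MACs the mauled ciphertext.
forged_tag = hmac.new(mac_key, forged_ct, hashlib.sha256).digest()

# The forged transcript decrypts to the new message under the real key.
decrypted = bytes(c ^ k for c, k in zip(forged_ct, keystream(enc_key, len(forged_ct))))
assert decrypted == forged_pt
```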

OTR message transport (source: Bonneau and Morrison, all colored stuff added).

Thus our hypothetical transcript forger can take an old transcript that says “would you like a Pizza“ and turn it into a valid transcript that says, for example, “would you like to hack STRATFOR“… Except that they probably can’t, since the first message is too short and… oh lord, this whole thing is a stupid idea — let’s stop talking about it.

The OTRv1 handshake. Oh yes, there’s also an OTRv1 protocol that has a few issues and isn’t really deniable. Even better, an MITM attacker can force two clients to downgrade to it, provided both support that version. Yuck.

So that’s the OTR protocol. While I’ve pointed out a few minor issues above, the truth is that the protocol is generally an excellent way to communicate. In fact it’s such a good idea that if you really care about secrecy it’s probably one of the best options out there.

Cryptocat

Since we’re looking at IM protocols I thought it might be nice to contrast with another fairly popular chat protocol: Cryptocat‘s group chat. Cryptocat is a web-based encrypted chat app that now runs on iOS (and also in Thomas Ptacek’s darkest nightmares).

Cryptocat implements OTR for ‘private’ two-party conversations. However OTR is not the default. If you use Cryptocat in its default configuration, you’ll be using its hand-rolled protocol for group chats.

The Cryptocat group chat specification can be found here, and it’s remarkably simple. There are no “long-term” keys in Cryptocat. Diffie-Hellman keys are generated at the beginning of each session and re-used for all conversations until the app quits. Here’s the handshake between two parties:

Cryptocat group chat handshake (current revision). Setting is Curve25519. Keys are generated when the application launches, and re-used through the session.

If multiple people join the room, every pair of users repeats this handshake to derive a shared secret between every pair of users. Individuals are expected to verify each others’ keys by checking fingerprints and/or running the Socialist Millionaire protocol.
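The pairwise structure is easy to sketch. Since the Python standard library has no Curve25519, the toy code below substitutes a small multiplicative group (illustrative only, not remotely secure); the point is just that every pair of room members independently derives a shared key from session-lifetime DH keys.

```python
import hashlib
import secrets
from itertools import combinations

# Toy DH group standing in for Curve25519 -- NOT secure.
P = 2**127 - 1
G = 5

class Member:
    def __init__(self, name: str):
        self.name = name
        # Generated once at launch and reused for the whole session,
        # mirroring Cryptocat's key lifetime.
        self.priv = secrets.randbelow(P - 2) + 1
        self.pub = pow(G, self.priv, P)
        self.shared = {}   # peer name -> derived pairwise key

    def handshake(self, peer_name: str, peer_pub: int) -> None:
        s = pow(peer_pub, self.priv, P)
        self.shared[peer_name] = hashlib.sha512(s.to_bytes(16, 'big')).digest()

room = [Member(n) for n in ("alice", "bob", "carol")]
for a, b in combinations(room, 2):   # every pair runs the handshake
    a.handshake(b.name, b.pub)
    b.handshake(a.name, a.pub)
```

With N members this means N·(N−1)/2 pairwise secrets, each of which the two endpoints must still verify out-of-band.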

Unlike OTR, the Cryptocat handshake includes no key confirmation messages, nor does it attempt to bind users to their identity or chat room. One implication of this is that I can transmit someone else’s public key as if it were my own — and the recipients of this transmission will believe that the person is actually part of the chat.

Moreover, since public keys aren’t bound to the user’s identity or the chat room, you could potentially route messages between a different user (even a user in a different chat room) while making it look like they’re talking to you. Since Cryptocat is a group chat protocol, there might be some interesting things you could do to manipulate the conversation in this setting.****

Does any of this matter? Probably not that much, but it would be relatively easy (and good) to fix these issues.

Message transmission and consistency. The next interesting aspect of Cryptocat is the way it transmits encrypted chat messages. One of the core goals of Cryptocat is to ensure that messages are consistent between individual users. This means that every user should be able to verify that the other users are receiving the same data as they are.

Cryptocat uses a slightly complex mechanism to achieve this. For each pair of users in the chat, Cryptocat derives an AES key and a MAC key from the Diffie-Hellman shared secret. To send a message, the client:

1. Pads the message by appending 64 bytes of random padding.
2. Generates a random 12-byte Initialization Vector for each of the N users in the chat.
3. Encrypts the message using AES-CTR under the shared encryption key for each user.
4. Concatenates all of the N resulting ciphertexts/IVs and computes an HMAC of the whole blob under each recipient’s key.
5. Calculates a ‘tag’ for the message by hashing the following data: padded plaintext || HMAC-SHA512_Alice || HMAC-SHA512_Bob || HMAC-SHA512_Carol || …
6. Broadcasts the N ciphertexts, IVs, MACs and the single ‘tag’ value to all users in the conversation.
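Here is roughly what that send path looks like in Python. The key handling and recipient bookkeeping are hypothetical, and a hash-derived keystream stands in for AES-CTR; only the padding size, IV size, per-recipient HMAC-SHA512 and the tag construction follow the steps above.

```python
import hashlib
import hmac
import os

def ctr_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Stand-in for AES-CTR (stdlib only): XOR with a hash-derived keystream."""
    out, ctr = bytearray(), 0
    while len(out) < len(plaintext):
        out += hashlib.sha256(key + iv + ctr.to_bytes(4, 'big')).digest()
        ctr += 1
    return bytes(p ^ k for p, k in zip(plaintext, out))

def send_message(plaintext: bytes, recipients: dict):
    """recipients: {name: (enc_key, mac_key)}, one pair per peer in the room."""
    padded = plaintext + os.urandom(64)        # step 1: 64 bytes random padding
    ivs, cts, macs = {}, {}, {}
    for name, (enc_key, _) in recipients.items():
        ivs[name] = os.urandom(12)             # step 2: fresh 12-byte IV per user
        cts[name] = ctr_encrypt(enc_key, ivs[name], padded)   # step 3
    # Step 4: HMAC the concatenated ciphertexts/IVs under each recipient's key.
    blob = b''.join(cts[n] + ivs[n] for n in sorted(recipients))
    for name, (_, mac_key) in recipients.items():
        macs[name] = hmac.new(mac_key, blob, hashlib.sha512).digest()
    # Step 5: the 'tag' binds the padded plaintext to every recipient's MAC.
    tag = hashlib.sha512(padded + b''.join(macs[n] for n in sorted(recipients))).digest()
    return ivs, cts, macs, tag                 # step 6: broadcast all of this

room_keys = {
    'bob':   (os.urandom(16), os.urandom(16)),   # (enc_key, mac_key)
    'carol': (os.urandom(16), os.urandom(16)),
}
ivs, cts, macs, tag = send_message(b'hi everyone', room_keys)
```

Each recipient decrypts their own ciphertext, then recomputes the tag from the padded plaintext and all the broadcast MACs to confirm everyone got the same message.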

When a recipient receives a message from another user, it verifies that:

1. The message contains a valid HMAC under its shared key.
2. The IV has not been received before from this sender.
3. The decrypted plaintext is consistent with the ‘tag’.

Roughly speaking, the idea here is to make sure that every user receives the same message. The use of a hashed plaintext is a bit ugly, but the argument here is that the random padding protects the message from guessing attacks. Make what you will of this.

Anti-replay. Cryptocat also seeks to prevent replay attacks, e.g., where an attacker manipulates a conversation by simply replaying (or reflecting) messages between users, so that users appear to be repeating statements. For example, consider the following chat transcripts:

Replays and reflection attacks.

Replay attacks are prevented through the use of a global ‘IV array’ that stores all IVs previously sent to or received from any user. If a duplicate IV arrives, Cryptocat will reject the message. This is unwieldy, but it generally seems adequate to prevent replays and reflection.
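A minimal version of the IV-array check might look like this (the class and method names are my own, not Cryptocat’s):

```python
import os

class ReplayGuard:
    """Global IV set: reject any message whose IV has already been seen,
    whether it was sent or received, to catch both replay and reflection."""
    def __init__(self):
        self.seen = set()

    def check(self, iv: bytes) -> bool:
        if iv in self.seen:
            return False      # duplicate IV: drop the message
        self.seen.add(iv)
        return True

guard = ReplayGuard()
iv = os.urandom(12)
first = guard.check(iv)       # first delivery: accepted
second = guard.check(iv)      # replayed copy: rejected
```

Note that the guarantee only holds while `seen` persists: clearing the set without also rotating keys, as described next, reopens the replay window.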

A limitation of this approach is that the IV array does not live forever. In fact, from time to time Cryptocat will reset the IV array without regenerating the client key. This means that if Alice and Bob both stay online, they can repeat the key exchange and wind up using the same key again — which makes them both vulnerable to subsequent replays and reflections. (Update: This issue has since been fixed).

In general the solution to these issues is threefold:

1. Keys shouldn’t be long-term, but should be regenerated using fresh random components for each key exchange.
2. Different keys should be derived for the Alice->Bob and Bob->Alice directions.
3. It would be more elegant to use a message counter than this big, unwieldy IV array.
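The second and third fixes are easy to sketch: mix the sender/receiver labels into the key derivation so each direction gets its own key, and replace the IV array with a strictly increasing per-sender counter. (The derivation labels below are hypothetical, not from any spec.)

```python
import hashlib
import hmac

def directional_key(shared_secret: bytes, sender: str, receiver: str) -> bytes:
    """Derive a distinct key per direction, so Alice->Bob traffic can never
    be reflected back and accepted as Bob->Alice traffic."""
    label = sender.encode() + b'->' + receiver.encode()   # hypothetical label
    return hmac.new(shared_secret, label, hashlib.sha256).digest()

k_ab = directional_key(b'example shared secret', 'alice', 'bob')
k_ba = directional_key(b'example shared secret', 'bob', 'alice')

class CounterWindow:
    """Per-sender message counter replacing the unbounded IV array:
    accept only counters strictly larger than the highest seen."""
    def __init__(self):
        self.highest = -1

    def accept(self, counter: int) -> bool:
        if counter <= self.highest:
            return False      # replay or reordering attack: reject
        self.highest = counter
        return True

window = CounterWindow()
```

A counter gives constant-size state per sender, while the IV array grows with every message and has to be reset eventually, which is exactly where the bug above crept in.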

The good news is that the Cryptocat developers are working on a totally new version of the multi-party chat protocol that should be enormously better.

In conclusion

I said this would be a post that goes nowhere, and I delivered! But I have to admit, it helped to get all this out of my system.

None of the issues I note above are the biggest deal in the world. They’re all subtle issues, which illustrates two things: first, that crypto is hard to get right. But also: that crypto rarely fails catastrophically. The exciting crypto bugs that cause you real pain are still few and far between.

Notes:

* In practice, you might argue that the higher-level IM protocol already leaks user identities (e.g., Jabber nicknames). However this is very much an implementation choice. Moreover, even when using Jabber with known nicknames, you might access the Jabber server using one of several different clients (your computer, phone, etc.). Assuming you use Tor, the only indication of this might be the public key you use during OTR. So there’s certainly useful information in this protocol.

** Notice that OTR signs a MAC instead of a hash of the user identity information. This happens to be a safe choice given that the MAC used is based on HMAC-SHA2, but it’s not generally a safe choice. Swapping the HMAC out for a different MAC function (e.g., CBC-MAC) would be catastrophic.

*** To get specific, imagine I wanted to produce a simulated transcript for some conversation with Bob. Provided that Bob’s client is online, I can send Bob any g^x value I want. It doesn’t matter if he really wants to talk to me — by default, his client will cheerfully send me back his own g^y and a signature on (g^x, g^y, pub_B, keyid_B) which, notably, does not include my identity. From this point on all future authentication is performed using MACs and encrypted under keys that are known to both of us. There’s nothing stopping me from faking the rest of the conversation.

**** Incidentally, a similar problem exists in the OTRv1 protocol.