[messaging] Opportunistic encryption and authentication methods

We've discussed methods for distributing and authenticating public keys in email-like messaging. I'll argue that "Opportunistic Encryption" (OE) might be a good approach to this type of secure messaging, and evaluate authentication methods in that light.

Background on Opportunistic Encryption
------

OE is an old concept, and people have different ideas about what it means [1,2]. My take on the core ideas:

* Authenticating a public key is harder than distributing it.

* Thus authentication and encryption should be decoupled, so that encryption can be deployed on a wide scale even without authentication.

This has traditionally been controversial: An encrypted-but-not-authenticated connection is vulnerable to active attack, so OE might not be worth much. It might even have negative value if it gives a false sense of security.

On the other hand:

1) There may be value to resisting large-scale passive eavesdropping if switching to large-scale active attack is costly.

2) OE provides a foundation on which authentication can be added (e.g. TOFU, fingerprints) [3].

3) A small number of users performing "stealthy" authentication could protect other users by creating uncertainty about which connections can be undetectably attacked [4].

This debate has played out in different ways in different protocols. For example:

* STARTTLS between mail servers generally uses OE, and has some good deployment between large providers [5]. People are thinking about how to add authentication [6,7].

* HTTPS is a non-OE protocol. OE for HTTP (not HTTPS) is being proposed [8,9].

OE for email-like messaging
------

There's another argument for OE in the person-to-person case:

4) In the absence of widespread OE, users who publish their public key and encrypt conversations will draw unwanted attention.
There's a new argument *against* widespread OE in the asynchronous messaging case: A key directory might get out of sync with a user, and return a public key that the user has (for example) lost the private key for.

I'll contend that 1-4 make a good case for widespread OE, and the risk of messages encrypted to an out-of-sync public key is manageable:

* At minimum, a service provider could implement a sort of "half-OE" by registering key pairs for users and simply holding the private keys. This would hide from outsiders whether the user had opted for full end-to-end encryption, and would provide some confidentiality for messages that flow through multiple providers (like email; this is an idea from UEE [21]).

* A service provider could store most users' private keys encrypted by a password, so that even a lost device doesn't result in undecryptable messages. A user could try password cracking in the worst case of a forgotten password.

* A third option is to simply give every user control of their own private key, and if they lose their device(s) then they might lose some messages sent before they upload a new key. That might be acceptable, or might not.

This could be debated more, but if you accept that OE makes sense here, some principles follow:

A) Since we want widespread OE, the goal should be for encryption to be as frictionless as possible (ideally enabled by default, including multiple-device support), scalable, and reliable. Users who don't care about end-to-end authentication should not be inconvenienced by it.

B) Since widespread OE would limit provider-based spam and malware filtering, figuring out how to move these to the client is important [10].

C) Authentication mechanisms should be evaluated on "stealthiness" as well as usability and security. Ideally it should be hard for any observer (including service providers) to tell which conversations are authenticated and which are not.
D) Authentication mechanisms will be built on top of OE, so can assume that "identity public keys" and "key directories" already exist.

Evaluating authentication methods for secure messaging with OE
------

We can take the above principles and see whether different authentication methods are compatible with widespread OE for messaging.

TOFU: Compatible with OE since users could "stealthily" enable notification of TOFU key changes, there's no effect on users who don't, and no scalability issues that would inhibit widespread OE.

FINGERPRINTS: Compatible with OE since users could "stealthily" communicate about fingerprints out-of-band, there's no effect on users who don't, and no scalability issues. In conjunction with TOFU, this is Moxie's "simple thing" argument [11,12].

KEYS AS IDENTIFIERS: Using public keys or fingerprints directly as identifiers, or attaching them to identifiers, has a long history (Bitcoin, YURLs / S-Links, SMTorP, CGA, etc.). The argument is that identifiers are being exchanged anyway, so we might as well piggyback authentication data.

I argue this violates the OE concept by inconveniencing users who don't care about end-to-end authentication (A). In particular, it adds costs such as:

i) usability cost of dealing with long, random-looking identifiers

ii) switching cost of replacing widely-distributed identifiers with new ones (in address books, memory, published materials, etc.)

iii) operational cost of redistributing identifiers whenever the private key changes. If users change keys frequently due to new devices, software reinstallation, lost passwords, etc., it would be inconvenient to change email addresses every time [11,15].

PROVIDER-IMMUTABLE NAME/KEY MAPPING VIA VERIFIABLE LOG: Namecoin proposes that users register a name for their public key in a cryptocurrency-type blockchain. Once the public key is registered, it can only be changed by expiration or a chain of signatures (signing a new key, which can sign another key, etc.).
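To make the "chain of signatures" idea concrete, here is a minimal sketch of how a client might walk such a chain from the originally registered key to the current one. This is not Namecoin's actual format or API. The function names (keygen, sign, verify, current_key) are hypothetical, expiration is not modeled, and a toy Lamport one-time signature built on SHA-256 stands in for a real signature scheme so that the example runs with only the standard library. A one-time scheme happens to fit here because each key signs exactly one successor.

```python
import hashlib
import secrets

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def keygen():
    """Toy Lamport one-time key pair (illustration only, not for real use)."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(H(a), H(b)) for a, b in sk]
    return sk, pk

def encode(pk) -> bytes:
    """Serialize a public key so that it can be signed."""
    return b"".join(a + b for a, b in pk)

def bits(msg: bytes):
    """The 256 bits of SHA-256(msg), most significant bit first."""
    digest = H(msg)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def sign(sk, msg: bytes):
    # Reveal one secret from each pair, selected by the message digest bits.
    return [sk[i][bit] for i, bit in enumerate(bits(msg))]

def verify(pk, msg: bytes, sig) -> bool:
    # Each revealed secret must hash to the matching half of the public key.
    return len(sig) == 256 and all(
        H(sig[i]) == pk[i][bit] for i, bit in enumerate(bits(msg))
    )

def current_key(registered_pk, chain):
    """Walk a chain of (new_pk, signature_by_previous_key) links.

    Returns the newest key if every link verifies, otherwise None.
    """
    pk = registered_pk
    for new_pk, sig in chain:
        if not verify(pk, encode(new_pk), sig):
            return None
        pk = new_pk
    return pk

# Example: the registered key signs key 1, which in turn signs key 2.
sk0, pk0 = keygen()
sk1, pk1 = keygen()
sk2, pk2 = keygen()
chain = [(pk1, sign(sk0, encode(pk1))), (pk2, sign(sk1, encode(pk2)))]
print(current_key(pk0, chain) == pk2)
```

The point of the sketch is that anyone who saw the original registration can check a key change without trusting the provider, which is also why a lost private key leaves the name stuck until it expires.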
There are some questionable design decisions in Namecoin [13,14], but the general idea of first-come, first-served names for public keys that are widely witnessed seems potentially useful.

If these names are the user's primary identifier, then this is similar to the "keys as identifiers" approach except keys are given better names by a public infrastructure. So while this improves (i), it still violates the OE concept due to (ii) the cost of switching to new names and (iii) the operational cost of having your identifier tied to a key. Additionally, publishing all names and relying on a new infrastructure raises hard-to-answer questions about privacy, reliability, and scalability.

If these names aren't primary identifiers, but are instead exchanged out-of-band to authenticate a specific public key, then this is similar to fingerprints except keys are given better names:

* my public key is "trevor_perrin_1970_email_2014 at Namecoin"

* my public key is "gacuqk - aqoq - ecsag - biza - sjebre" (base32 fingerprint)

But this trades off "stealth" (C), as users with named keys are advertising that they care about end-to-end authentication and might be comparing keys out-of-band. Users without named keys can probably be attacked with impunity.

It's possible that the usability benefit of "named keys" instead of fingerprints might justify the infrastructure cost and loss of stealthy authentication, but the tradeoff is hard to evaluate.

PROVIDER-UPDATEABLE NAME/KEY MAPPING VIA VERIFIABLE LOG: This is the idea of a "transparency log", inspired by Certificate Transparency, which is being explored by Keybase and Google's End-to-End [16,17,18].

Compared to a "provider-immutable" log, this accepts a more modest security goal (notify on key changes) so that it works with existing identifiers. Moxie argues this goal is not much different than what TOFU + fingerprints can achieve [19]. That's worth exploring more, but to me this seems different enough that it would add security.
In any case, this doesn't suffer from (ii) or (iii), so the main questions regarding compatibility with OE are privacy and infrastructure cost.

Privacy: Hashing identifiers won't be that effective [20], so this is asking service providers to publish identifiers for a large portion of their userbase.

Infrastructure cost:

- Instead of just looking up Bob's public key, Alice needs to look up a proof-of-inclusion, which might increase the response size to 1 KB+ for large providers.

- Storage of all the log data, and recalculating new logs, might be significant, depending on frequency of log publication, frequency of key changes, size of userbase, etc.

- To be practical, new keys would probably be batched into a new log every 24 hours or so, which adds a delay that's not trivial to deal with.

- To be effective, third-party monitors would need to download and review log entries, and it's not clear who these are and what costs they'd have to pay to keep up.

ANONYMIZED LOOKUP AND AUDITING: Some projects (e.g. Nyms [22]) have suggested key lookups be performed via anonymized connections (e.g. Tor, or a similar chain of proxies). Then users could audit their own key directory just by looking up their own key.

For widespread OE these lookups would be frequent. Whether the latency, reliability, and infrastructure cost of anonymizing them is acceptable seems like an open question.

Conclusions
-----

Not sure. The TL;DR is that there might be value to deploying end-to-end encryption at scale, even without end-to-end authentication (OE), so it would be good to have authentication methods that enhance the value of that instead of impeding it.
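As a footnote to the proof-of-inclusion size mentioned under infrastructure cost: a rough estimate, assuming the log is a balanced binary Merkle tree with 32-byte (SHA-256) node hashes, as in Certificate Transparency. The function name and the userbase sizes are only illustrative, and the estimate leaves out the key itself, the signed tree head, and any encoding overhead.

```python
HASH_BYTES = 32  # assumption: SHA-256 node hashes

def inclusion_proof_bytes(num_entries: int, hash_bytes: int = HASH_BYTES) -> int:
    """Approximate audit-path size for one leaf in a balanced binary Merkle tree."""
    if num_entries <= 1:
        return 0
    depth = (num_entries - 1).bit_length()  # ceil(log2(num_entries))
    return depth * hash_bytes

for users in (10**6, 10**8, 10**9):
    print(users, inclusion_proof_bytes(users))
# 1000000 640
# 100000000 864
# 1000000000 960
```

So under these assumptions a provider with around a billion entries would add close to 1 KB of hashes per lookup before any other overhead, and the cost grows only logarithmically with the size of the userbase.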
Trevor

[1] http://en.wikipedia.org/wiki/Opportunistic_encryption
[2] https://datatracker.ietf.org/doc/draft-dukhovni-opportunistic-security/?include_text=1
[3] http://www.ietf.org/mail-archive/web/uta/current/msg00311.html
[4] https://moderncrypto.org/mail-archive/messaging/2014/000229.html
[5] https://www.eff.org/encrypt-the-web-report
[6] https://github.com/jsha/starttls-everywhere/blob/master/README.md
[7] https://datatracker.ietf.org/doc/draft-ietf-dane-smtp-with-dane/
[8] http://httpwg.github.io/http-extensions/encryption.html
[9] http://lists.w3.org/Archives/Public/ietf-http-wg/2014JulSep/1727.html
[10] https://moderncrypto.org/mail-archive/messaging/2014/000727.html
[11] https://moderncrypto.org/mail-archive/messaging/2014/000718.html
[12] https://moderncrypto.org/mail-archive/messaging/2014/000723.html
[13] https://moderncrypto.org/mail-archive/messaging/2014/000679.html
[14] https://moderncrypto.org/mail-archive/messaging/2014/000685.html
[15] https://moderncrypto.org/mail-archive/messaging/2014/000234.html
[16] https://moderncrypto.org/mail-archive/messaging/2014/#226
[17] https://moderncrypto.org/mail-archive/messaging/2014/000706.html
[18] https://moderncrypto.org/mail-archive/messaging/2014/#708
[19] https://moderncrypto.org/mail-archive/messaging/2014/000723.html
[20] https://moderncrypto.org/mail-archive/messaging/2014/000766.html
[21] https://github.com/tomrittervg/uee/blob/master/proposal.md
[22] http://nyms.io/