Recently, Tobias Pulls and Rasmus Dahlberg published a paper entitled Website Fingerprinting with Website Oracles.

"Website fingerprinting" is a category of attack where an adversary observes a user's encrypted data traffic, and uses traffic timing and quantity to guess what website that user is visiting. In this attack, the adversary has a database of web pages, and regularly downloads all of them in order to record their traffic timing and quantity characteristics, for comparison against encrypted traffic, to find potential target matches.

Practical website traffic fingerprinting attacks against the live Tor network have been limited by the sheer quantity and variety of all kinds (and combinations) of traffic that the Tor network carries. The paper reviews some of these practical difficulties in sections 2.4 and 7.3.

However, if specific types of traffic can be isolated, such as through onion service circuit setup fingerprinting, the attack seems more practical. This is why we recently deployed cover traffic to obscure client side onion service circuit setup.

To address the problem of practicality against the entire Internet, this paper uses various kinds of public Internet infrastructure as side channels to narrow the set of websites and website visit times that an adversary has to consider. This allows the attacker to add confidence to their classifier's guesses, and rule out false positives, for low cost. The paper calls these side channels "Website Oracles".

As this table illustrates, several of these Website Oracles are low-cost/low-effort and have high coverage. We're particularly concerned with DNS, Real Time Bidding, and OCSP.

All of these oracles matter to varying degrees for non-Tor Internet users too, particularly in instances of centralized plaintext services. Because both DNS and OCSP are in plaintext, and because it is common practice for DNS to be centralized to public resolvers, and because OCSP queries are already centralized to the browser CAs, DNS and OCSP are good collection points to get website visit activity for large numbers of Internet users, not just Tor users.

Real Time Bidding ad networks are also a vector that Mozilla and EFF should be concerned about for non-Tor users, as they leak even more information about non-Tor users to ad network customers. Advertisers need not even pay anything or serve any ads to get information about all users who visit all sites that use the RTB ad network. On these bidding networks, visitor information is freely handed out to help ad buyers decide which users/visits they want to serve ads to. Nothing prevents advertisers from retaining this information for their own purposes, which also enables them to mount attacks, such as the one Tobias and Rasmus studied.

In terms of mitigating the use of these vectors in attacks against Tor, here's our recommendations for various groups in our community: