Who’s surfing who?

THE web is watching you. Chunks of code hide inside every website, tracking your online behaviour.

Now, a pair of computer scientists have published their attempt to spy back. They audited 1 million of the most popular websites for tracking behaviours – more than anyone has looked at before. Their investigation gives new insight not only into what sites might know about you, but also into how they’re figuring it out.

Studying a million websites is hard. To do it, Arvind Narayanan – who heads the Web Transparency and Accountability Project at Princeton University – built a tool called OpenWPM with graduate student Steven Englehardt. OpenWPM can visit and log in to websites automatically, taking more than a dozen measurements of each one. It took two weeks to crawl through the top million websites, as ranked by web traffic firm Alexa.


Narayanan and Englehardt discovered that many trackers are sharing the information they gather with at least one other party, sometimes dozens of times. The audit also revealed several previously unknown “fingerprinting” techniques that sites are using. Here, the website asks the browser to perform a task that is hidden from the user. The site then fingerprints individual machines based on slight differences in their performance. Trackers used to do this by watching how the browser draws a graphic; now, they check what fonts are installed or how the browser processes audio. A couple of trackers even gathered the device’s battery level.
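
To see why such small differences are enough to single out a machine, here is a minimal sketch in Python of how several readings might be folded into one identifier. It is an illustration only, not the method used by any tracker in the audit: the function name `fingerprint`, the attribute names and all the values are assumptions made up for this example.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a dictionary of observed browser attributes into a short ID."""
    # Sort the keys so the same attributes always serialise the same way.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical readings from two visiting devices (all values are invented).
device_a = {
    "fonts": ["Arial", "Helvetica", "Times New Roman"],
    "audio_sum": 124.0434,   # assumed result of a hidden audio-processing task
    "canvas_hash": "9f2c",   # assumed digest of a hidden drawing task
}
# The second device differs only in a tiny audio-processing quirk.
device_b = {**device_a, "audio_sum": 124.0801}

print(fingerprint(device_a) == fingerprint(dict(device_a)))  # True
print(fingerprint(device_a) == fingerprint(device_b))        # False
```

The same device yields the same identifier on every visit, with no cookie stored, while a second device that differs in just one reading gets a different one.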

Tracking lets websites serve targeted ads, personalise what users see, or even price products differently. Audits like this one can make the process behind these behaviours more transparent, says Narayanan.

“You often don’t know how much tracking is going on, who’s doing the tracking, or what data they’re collecting about you and what that will be used for,” he says. “There needs to be external oversight, somebody holding companies’ feet to the fire.”

Overall, they discovered more than 81,000 third-party trackers. News websites had the most, on average. Adult websites and those owned by government agencies and universities tended to have the fewest.
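
A rough sketch of how third parties could be picked out of a crawl log follows, again in Python. It assumes a log that maps each visited page to the addresses its browser requested, and it uses the last two labels of a hostname as a crude stand-in for the site's domain. Both are simplifications for illustration. The article does not describe how the audit decided which third parties count as trackers, and the helper names and addresses here are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_key(url: str) -> str:
    """Crude stand-in for a site's domain: the last two hostname labels."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def third_parties(page_url: str, request_urls: list) -> set:
    """Return the domains contacted by a page that differ from its own."""
    page = site_key(page_url)
    keys = {site_key(u) for u in request_urls}
    return {k for k in keys if k and k != page}


# Hypothetical crawl log: visited page -> addresses requested while loading.
crawl = {
    "https://news.example/": [
        "https://cdn.news.example/app.js",       # same site, so first party
        "https://tracker-one.test/pixel.gif",
        "https://ads.tracker-two.test/ad.js",
    ],
    "https://www.uni.example/": [
        "https://tracker-one.test/pixel.gif",
    ],
}

# Count on how many visited sites each third party appears.
reach = Counter()
for page, requests in crawl.items():
    reach.update(third_parties(page, requests))

print(reach.most_common())
```

A real measurement would need a proper list of public domain suffixes rather than the two-label shortcut, since names such as `example.co.uk` would otherwise be grouped incorrectly.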

Information like this could be helpful for privacy tools like Ghostery, a popular browser extension that blocks trackers, says Narayanan. “A big part of our research is helping [software] like Ghostery,” he says. “Tools like this can block only the known stuff, not the unknown stuff.”

David Choffnes of Northeastern University in Boston says it’s hard to be surprised by revelations like this when web tracking is so ubiquitous. “Is it frustrating and disappointing? Very much,” he says. “Such studies are important to keep consumers aware of privacy risks while browsing the web, informing regulators, and guiding the design of countermeasures for those who do not want to be tracked.”

This article appeared in print under the headline “How websites take your fingerprint on the sly”