Edward Snowden, whistleblower of the moment (Image: The Guardian via Getty Images)

The US government is watching every digital move that Americans make. More than 115 million people use Verizon’s cellphone service in the US, making billions of calls every year. A top-secret document revealed this week shows that the US government, through the National Security Agency, is collecting the details of every single one of those calls on a daily basis. To make matters worse, The Washington Post and The Guardian newspapers today claimed that the NSA also has direct access to the search history, email and even live chats of all customers of the world’s biggest technology firms, including Google, Apple and Facebook.

By turning over what surely amounts to billions of call logs to the US government, Verizon is enabling what is likely to be the broadest surveillance scheme in history. And it is unlikely to be the only carrier doing so.

The secret court order was granted by the Foreign Intelligence Surveillance Court in Washington DC, which oversees surveillance requests. It forces Verizon to turn over its data. But while the order makes it clear that content – the words exchanged during calls – is not collected, that’s little comfort from a privacy perspective. Using network science, it is easy to manipulate large databases like this to figure out exactly who is behind every phone number, who they’ve talked to, when, where and for how long. The NSA probably doesn’t care to track the movements and activities of every person in the Verizon database, but the possibility is just a mouse click away.


Four calls to find you

We don’t know exactly how the NSA analyses these huge lists of records, but we do know what kinds of insights can be drawn from data sets on this scale. Yves-Alexandre de Montjoye from the Massachusetts Institute of Technology and Vincent Blondel from the Université Catholique de Louvain (UCL) in Belgium and colleagues analysed 1.5 million anonymised call records from a Western cell carrier. They showed that it takes just four calls or text messages, each made at a different time and place, to distinguish one person’s movements from everyone else’s (Nature Scientific Reports, doi.org/msd).
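The idea that a handful of time-and-place points can single out one person is easy to sketch on toy data. The traces below are invented for illustration and are not from the study; the helper names `matching_users` and `fraction_unique` are hypothetical, and the measure is a simplified stand-in for the researchers' analysis, not their actual method.

```python
from itertools import combinations

# Made-up traces: each user maps to a set of (hour, cell_tower) points.
TRACES = {
    "user_a": {(8, "T1"), (9, "T2"), (13, "T3"), (18, "T1"), (22, "T4")},
    "user_b": {(8, "T1"), (9, "T2"), (13, "T5"), (18, "T1"), (22, "T4")},
    "user_c": {(8, "T1"), (10, "T2"), (13, "T3"), (18, "T6"), (22, "T4")},
}

def matching_users(traces, known_points):
    """Return the users whose trace contains every known point."""
    known = set(known_points)
    return sorted(user for user, points in traces.items() if known <= points)

def fraction_unique(traces, k):
    """Share of all k-point samples that match exactly one user."""
    trials = unique = 0
    for user, points in traces.items():
        for subset in combinations(sorted(points), k):
            trials += 1
            if matching_users(traces, subset) == [user]:
                unique += 1
    return unique / trials

# One common point narrows nothing; two well-chosen points isolate user_a.
print(matching_users(TRACES, [(8, "T1")]))
print(matching_users(TRACES, [(13, "T3"), (9, "T2")]))
print(fraction_unique(TRACES, 1), fraction_unique(TRACES, 2))
```

Even in this three-person toy, the share of samples that pin down a single user rises as more points are known, which is the effect the researchers measured at scale.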

Patterns of communication form a digital fingerprint in time, and finding every thing, person and place you have interacted with becomes easy. Such records are exactly the kind of information we now know that Verizon, and likely every other US carrier, is handing over to the NSA on a daily basis.

Judge Roger Vinson, at the Foreign Intelligence Surveillance Court, signed an order on 25 April obliging Verizon to hand data “including but not limited to session identifying information, trunk identifier… and time and duration of call” over to the NSA on a daily basis. In a news conference on Thursday morning (6 June), Senator Dianne Feinstein confirmed that this is just a routine three-month renewal of a secret order which has been in effect for seven years.

Identifying information refers to the phone numbers of those making and receiving a call or text. The trunk identifier shows which cell towers the calling and receiving phones talked to – the callers’ locations, in other words. Blondel says that datasets like those Verizon is handing over could be used to build up a precise picture of different communities.

Chris Clifton, who works on data privacy at Purdue University in Indiana, says he expects the NSA doesn’t always know exactly what it’s looking for in the call metadata, but rather uses software to sort the records into groups by similarity – people who make lots of calls, for example, or people who never call abroad. Patterns in time could be useful too. If one call appears to spark off a whole flurry of other calls, that might conceivably mean the first phone number belongs to an authority figure in a criminal organisation, for instance.
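The "flurry of calls" pattern Clifton describes can be illustrated with a simple count. This is a minimal sketch on invented records, not a description of any software the NSA uses: the call list, the ten-minute window and the function name `follow_on_calls` are all assumptions made for the example.

```python
from collections import defaultdict

# Made-up metadata: (minute, caller, callee). No content, just who and when.
CALLS = [
    (0, "hub", "n1"),
    (1, "hub", "n2"),
    (3, "n1", "n3"),
    (4, "n1", "n4"),
    (5, "n2", "n5"),
    (50, "n3", "shop"),
    (200, "other", "n5"),
]

def follow_on_calls(calls, window=10):
    """For each caller, count the calls their callees place within
    `window` minutes after being called."""
    outgoing = defaultdict(list)
    for minute, caller, _ in calls:
        outgoing[caller].append(minute)

    scores = defaultdict(int)
    for minute, caller, callee in calls:
        scores[caller] += sum(
            1 for later in outgoing[callee] if minute < later <= minute + window
        )
    return dict(scores)

scores = follow_on_calls(CALLS)
# The number whose calls trigger the most follow-on activity.
print(max(scores, key=scores.get), scores)
```

A number whose calls are routinely followed by bursts from the people it rang stands out in the scores, without anyone having listened to a word.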

They know everything

“You’re trusting the phone companies with this data like you’re trusting your bank with your financial transactions,” Blondel says. “They know when you go for surgery, divorce – they know everything.”

“Any sensible question you can ask about the call metadata would be answered in a fraction of a second by five-year-old supercomputers,” says cryptographer Daniel Bernstein from the University of Illinois, Chicago. This means the NSA’s giant supercomputing centre in Utah is massive overkill for analysing measly Verizon call logs. Perhaps it would be more useful for crunching internet data.

An NSA PowerPoint presentation obtained by The Guardian newspaper in London and The Washington Post claims that the NSA is gaining direct access to the servers of the world’s biggest tech firms to spy on internet activity. According to the slides, Google, Yahoo, Apple, Facebook and more are all signed up to a scheme, known as PRISM, which lets the NSA access their customers’ search history, chat logs and emails. The presentation says that data gained from PRISM is used to create nearly 1 in 7 of all intelligence reports. Executives of all the firms implicated have denied knowledge of any such programme and reject the allegation that they have been handing over their customers’ data in this way.

But even if the NSA does not have full internet access, it’s still relatively easy for it to access private data on the internet. Details are scarce, but there is one confirmed case where the NSA was caught in the act. An AT&T engineer named Mark Klein provided evidence that the NSA was skimming a copy of all internet traffic that passed through an AT&T data centre in San Francisco in 2003.

Now Andrew Clement and a team of information scientists at the University of Toronto in Canada are using that model of surveillance to try to give internet users a sense of whether and where their internet activities are being logged by the NSA. Clement’s system, called IXMaps, has aggregated thousands of traceroutes – information trails which map the paths taken by packets of data as they are directed through the routers and exchanges which make up the internet in the US.

Internet monitoring

A paper due to be presented at the International Symposium on Technology and Society in Toronto at the end of June shows that 99 per cent of internet traffic passing through the US goes through one of just 18 US cities. The paper notes that this shows it is completely feasible for the NSA to be monitoring the majority of US internet traffic with just a handful of warrantless listening posts. These would use ‘splitters’ that split the beam of light in fibre-optic cables to siphon off information. “It is powerful confirmation that it is technically feasible for the NSA to install splitters in relatively few strategic internet choke points from where it could intercept a very large proportion of internet traffic,” it says.
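The choke-point argument boils down to a coverage count over traceroutes: what share of paths touch at least one monitored city? The sketch below uses invented routes, not IXMaps data, and the function names `coverage` and `greedy_sites` are hypothetical. The greedy selection is one simple way to pick sites and is not claimed to be the paper's method.

```python
# Made-up traceroutes, reduced to the cities each path passes through.
ROUTES = [
    ["Seattle", "Chicago", "New York"],
    ["Denver", "Chicago", "Ashburn"],
    ["Minneapolis", "Chicago", "Atlanta"],
    ["Los Angeles", "Dallas", "Atlanta"],
    ["Miami", "Atlanta", "Ashburn"],
    ["Boise", "Salt Lake City", "Denver"],
]

def coverage(routes, monitored):
    """Fraction of routes that pass through at least one monitored city."""
    monitored = set(monitored)
    hit = sum(1 for route in routes if monitored & set(route))
    return hit / len(routes)

def greedy_sites(routes, n):
    """Pick up to n cities, each time taking the one that appears on the
    most routes not yet covered (ties broken alphabetically)."""
    remaining = [set(route) for route in routes]
    chosen = []
    for _ in range(n):
        counts = {}
        for route in remaining:
            for city in route:
                counts[city] = counts.get(city, 0) + 1
        if not counts:
            break
        best = max(sorted(counts), key=counts.get)
        chosen.append(best)
        remaining = [route for route in remaining if best not in route]
    return chosen

sites = greedy_sites(ROUTES, 2)
print(sites, coverage(ROUTES, sites))
```

In this toy network, two sites are enough to see five of the six routes, a small-scale version of the concentration the paper reports.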

Nancy Paterson, who works on IXMaps with Clement, says the internet is not a random collection of network links, routing data in the most efficient way possible. Instead, the way data moves across the net is tightly controlled according to the business interests that run the subnetworks within it. This control makes blanket monitoring feasible.

“Routing isn’t what you used to call it. The best-effort internet has changed to a highly centralised, controlled space,” she says. “It’s not your grandmother’s internet.”

Although privacy protection may not seem to be on the NSA’s priority list, Clifton says he knows the organisation has people actively working on techniques which would let it analyse data effectively while not breaching privacy. “If they get too intrusive on the data people will be up in arms and they will lose access,” he says. “If they protect privacy they can get more data. They view it as part of their mission.”

De Montjoye says the NSA revelations emphasise the need for new systems which allow rich datasets like mobile phone data to be used while protecting privacy at the same time. An ongoing project at MIT, called openPDS, aims to do exactly this. OpenPDS lets third parties ask questions of a customer dataset without ever getting their hands on the raw data. De Montjoye says this, combined with legal systems which notify individuals when their data has been searched, and auditing systems that record who is searching for what information and when, could change the privacy debate. “I think that such a ‘mixed approach’ to privacy is the way forward,” he says.
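The "ask questions, never see the raw data" idea, paired with an audit trail, can be sketched in a few lines. This is an illustration of the general pattern only and does not reproduce the openPDS software or its interface: the class `QueryOnlyStore`, its method `ask` and the two permitted questions are all hypothetical.

```python
from datetime import datetime, timezone

class QueryOnlyStore:
    """Holds raw call records privately and answers only approved questions."""

    def __init__(self, records):
        self._records = list(records)   # raw data stays inside the store
        self.audit_log = []             # who asked what, when, and whether allowed
        self._questions = {
            "total_calls": lambda recs: len(recs),
            "distinct_contacts": lambda recs: len({r["callee"] for r in recs}),
        }

    def ask(self, requester, question):
        allowed = question in self._questions
        self.audit_log.append(
            (datetime.now(timezone.utc), requester, question, allowed)
        )
        if not allowed:
            raise PermissionError(f"question not permitted: {question}")
        return self._questions[question](self._records)

store = QueryOnlyStore([
    {"callee": "555-0101", "minutes": 3},
    {"callee": "555-0102", "minutes": 12},
    {"callee": "555-0101", "minutes": 1},
])
print(store.ask("weather_app", "distinct_contacts"))
print(len(store.audit_log))
```

The requester gets an aggregate answer, the individual records never leave the store, and every request, permitted or not, leaves a log entry that the data's owner could later inspect.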