According to newly published documents, Canadian spies tracked thousands of travelers online for days after they left an unnamed Canadian airport.

This revelation, gleaned from 2012 slides (PDF) provided by whistleblower Edward Snowden, shows that the Communications Security Establishment Canada (CSEC) conducted a real-world test that began with a “single seed Wi-Fi IP address” from an “international airport” and assembled a “set of user IDs seen on network address over two weeks.”

The technique appears to be related to one outlined by University of California San Diego and Microsoft researchers in a 2010 research paper (PDF).

It seems that CSEC used this technique to conduct a real-world test of data analysis across a “modest size city” to look for a fake “kidnapper” making ransom calls. CSEC's conclusion was that while the technique was “successful experimentally,” it was “too slow to allow for practical productization.”

The agency then switched to something called the “Collaborative Analytics Research Environment,” described as a “big-data system being trialed at CSEC (with NSA launch assist).”

And its conclusion? “A new needle-in-a-haystack analytic is viable: contact chaining across air gaps. Enabled by sweep capability of IP profiling. Should test further to understand robustness with respect to loosening assumption of target behaviour. Beyond kidnapping, tradecraft could also be used for any target that makes occasional forays into other cities/regions.”

“Repeat to cover whole world”

The documents do not describe precisely what the “user ID” is. But computer security researchers believe that it could be either a persistent tracking cookie or another unique identifier, such as an e-mail login, that would be consistently transmitted in the clear.

The documents do say that “two weeks worth of ID-IP data [came] from Canadian Special Source.” The precise name of that source is redacted in the version published by the Canadian Broadcasting Corporation (CBC).

CSEC then “follow[ed] IDs backward and forward in recent time,” tracking them to “local hotels, domestic airports, local transportation hubs, local Internet cafés,” as well as “other international airports, domestic airports, major international hotels.”

CSEC concluded: “Can then take seeds from these airports and repeat to cover whole world.”

Why would spy agencies want this? “Analytic can hop-sweep through IP address space to identify set of IP addresses for hotels and airports, detecting target presence within set will trigger an urgent alert.”
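The slides do not publish an algorithm, but the seed-and-repeat idea they describe can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function name `hop_sweep`, the `(user_id, ip)` record format, and the sample addresses (drawn from reserved documentation ranges) are hypothetical and do not come from the CSEC documents.

```python
from collections import defaultdict


def hop_sweep(sightings, seed_ip, max_hops=2):
    """Expand outward from a seed IP to other IPs that share user IDs with it.

    sightings: iterable of (user_id, ip) pairs (a hypothetical record format).
    Returns a dict mapping each reached IP to the hop at which it was found.
    """
    ids_by_ip = defaultdict(set)
    ips_by_id = defaultdict(set)
    for user_id, ip in sightings:
        ids_by_ip[ip].add(user_id)
        ips_by_id[user_id].add(ip)

    reached = {seed_ip: 0}
    frontier = {seed_ip}
    for hop in range(1, max_hops + 1):
        next_frontier = set()
        for ip in frontier:
            # Every ID seen at this IP leads to the other IPs where it appeared.
            for user_id in ids_by_ip[ip]:
                for other_ip in ips_by_id[user_id]:
                    if other_ip not in reached:
                        reached[other_ip] = hop
                        next_frontier.add(other_ip)
        if not next_frontier:
            break
        frontier = next_frontier
    return reached


# Made-up sample data: ID "a" links the seed to a second address,
# and ID "b" links that second address to a third.
sample = [
    ("a", "192.0.2.1"),
    ("a", "198.51.100.7"),
    ("b", "198.51.100.7"),
    ("b", "203.0.113.5"),
    ("c", "203.0.113.99"),
]
print(hop_sweep(sample, "192.0.2.1"))
```

In this toy data, the address seen only with ID "c" shares no ID with anything reachable from the seed, so the sweep never visits it. A real system would also need timestamps and some way to label each address as a hotel, airport, or café, which this sketch leaves out.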

In a statement on its website, CSEC maintained that it is:

[L]egally authorized to collect and analyze metadata. In simple terms, metadata is technical information used to route communications, and not the contents of a communication. The classified document in question is a technical presentation between specialists exploring mathematical models built on everyday scenarios to identify and locate foreign terrorist threats. The unauthorized disclosure of tradecraft puts our techniques at risk of being less effective when addressing threats to Canada and Canadians. It is important to note that no Canadian or foreign travellers were tracked. No Canadian communications were, or are, targeted, collected or used. And all CSE activities include measures to protect the privacy of Canadians.

Ron Deibert, one of Canada's top technical experts and the director of the Citizen Lab at the University of Toronto, wrote in The Globe and Mail:

What’s this mean for Canadians? When you go to the airport and flip open your phone to get your flight status, the government could have a record. When you check into your hotel and log on to the Internet, there’s another data point that could be collected. When you surf the Web at the local cafe hotspot, the spies could be watching. Even if you’re just going about your usual routine at your place of work, they may be following your communications trail.

“Particularly clever”

Nicholas Weaver, a computer security researcher at the International Computer Science Institute based in Berkeley, California, speculated that CSEC could have collected the relevant data in one of two ways.

“The real interesting thing is not the analysis of the data—a lot of people could do this,” he told Ars. “The question is: how did they acquire the raw data for the analysis? They likely acquired the raw data in one of two ways. One, wiretapping close to the analytics side and pulling the side off the wire, which is easy to do. Or two, they simply acquired the data from somebody who’s already collecting. It could be purchased just as easily as it could be compelled.”

For example, he said, an analytics firm could have handed over a data set including unique tracking cookies (such as the __utma cookie set by Google Analytics) and IP addresses, as well as dates and times. The spies could then have followed that cookie as it popped up again whenever the same computer or mobile device logged onto a different network at a different time.
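To make Weaver's scenario concrete, here is a short Python sketch of how one identifier might be followed through such a data set. It is only an illustration under stated assumptions: the `(user_id, ip, timestamp)` record layout, the function name `trail_for_id`, and the sample values are all hypothetical, and nothing in the documents says CSEC's data looked like this.

```python
from datetime import datetime


def trail_for_id(records, user_id):
    """Return the time-ordered (timestamp, ip) trail for one identifier.

    records: iterable of (user_id, ip, timestamp) tuples (assumed layout).
    Consecutive sightings at the same IP are collapsed into the first one.
    """
    seen = sorted((ts, ip) for uid, ip, ts in records if uid == user_id)
    trail = []
    for ts, ip in seen:
        if not trail or trail[-1][1] != ip:
            trail.append((ts, ip))
    return trail


# Made-up records: one cookie seen on three different networks.
records = [
    ("cookie-1", "192.0.2.1", datetime(2012, 5, 1, 9, 0)),
    ("cookie-1", "192.0.2.1", datetime(2012, 5, 1, 9, 5)),
    ("cookie-2", "192.0.2.1", datetime(2012, 5, 1, 9, 7)),
    ("cookie-1", "198.51.100.7", datetime(2012, 5, 1, 20, 0)),
    ("cookie-1", "203.0.113.5", datetime(2012, 5, 3, 8, 0)),
]
for ts, ip in trail_for_id(records, "cookie-1"):
    print(ts.isoformat(), ip)
```

Each change of address in the resulting trail marks a moment when the same device surfaced on a different network, which is the movement signal Weaver describes.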

Using the bulk data itself enables the spies to determine which IP addresses correspond to different fixed locations, such as other airports, hotels, cafés and businesses. While public Internet searches sometimes turn up associations between known IP addresses and known locations, that information is often wrong. The goal of this new CSEC technique appears to be to make this kind of data, which links movements to locations, far more reliable.

“What the question is: given this [unique ID] as a raw data input, what can you infer about the usage about IP addresses scattered throughout the world?” he said. “You have an existing database from IP geolocation data, but you want to know what role this IP address has. That’s why this is particularly clever.”

Christopher Parsons, a fellow at the Citizen Lab at the University of Toronto, a group that helped review the documents, added that while using corporate analytics may have been one possible attack vector, there could have been another.

“There’s a series of different kinds of identifiers—that’s not entirely clear from the documents,” he told Ars.

“It’s also theoretically possible that [CSEC] may be tapping into other identifiers. There’s going to be some global database that they’re pulling from. Whether it’s going to be cookies or another identifier. My thought would be [if not cookies] that if they’re looking for particular chat user names or e-mail that is also sent in clear or sent in clear often enough. One of [the] pieces about this [is] that it seems to indicate that it’s the act of logging on. It’s not clear that you have to make some particular action, it’s that the device[s] are likely to be sending out this kind of information upstream. It is possible that it’s your username every time you hit the mail server.”

He also noted that in Canada, the two major ISPs—Bell and Rogers—provide, by default, e-mail accounts on Microsoft and Yahoo, respectively.

So, he speculated, if CSEC was going to use such an e-mail username for instance, “that ISP is going to have a litany of personal information about a Canadian target, billing and everything else that they hold, whereas the cookie information may not provide [all that information.]”

Parsons and Weaver both added that using Tor, VPNs, and anti-tracking software (such as the browser plugins Disconnect or Ghostery) may help thwart this type of tracking, at least in part.

“This is one of the most pernicious things on the Web, is that the Web is built on tracking,” Weaver observed. “As I like to say: ‘Facebook likes your taste in porn.’ The like button tracks the pages you read. It’s not even anonymous tracking but explicit. Web tracking is a serious problem because it enables all sorts of evil stuff. Basically the NSA and company are leveraging that infrastructure.”