Latest Leak Shows NSA Can Collect Nearly Any Internet Activity Worldwide Without Prior Authorization

from the the-NSA-should-really-stop-issuing-denials dept

The newest NSA leak has just been posted at the Guardian and it gives credence to Snowden's earlier claim that he could, "from his desk," wiretap nearly anyone in the world. US officials, including NSA apologist/CISPA architect/Internet hater Mike Rogers, denied Snowden's claim, with Rogers going so far as to call the former NSA contractor a liar. The documents leaked today seem to indicate otherwise.

A top secret National Security Agency program allows analysts to search with no prior authorization through vast databases containing emails, online chats and the browsing histories of millions of individuals, according to documents provided by whistleblower Edward Snowden.



The NSA boasts in training materials that the program, called XKeyscore, is its "widest-reaching" system for developing intelligence from the internet.



[T]raining materials for XKeyscore detail how analysts can use it and other systems to mine enormous agency databases by filling in a simple on-screen form giving only a broad justification for the search. The request is not reviewed by a court or any NSA personnel before it is processed.

The purpose of XKeyscore is to allow analysts to search the metadata as well as the content of emails and other internet activity, such as browser history, even when there is no known email account (a "selector" in NSA parlance) associated with the individual being targeted.



Analysts can also search by name, telephone number, IP address, keywords, the language in which the internet activity was conducted or the type of browser used.



One document notes that this is because "strong selection [search by email address] itself gives us only a very limited capability" because "a large amount of time spent on the web is performing actions that are anonymous."

E.g., Someone whose language is out of place for the region they are in

Someone who is using encryption

Someone searching the web for suspicious stuff

The quantity of communications accessible through programs such as XKeyscore is staggeringly large. One NSA report from 2007 estimated that there were 850bn "call events" collected and stored in the NSA databases, and close to 150bn internet records. Each day, the document says, 1-2bn records were added.



The XKeyscore system is continuously collecting so much internet data that it can be stored only for short periods of time. Content remains on the system for only three to five days, while metadata is stored for 30 days. One document explains: "At some sites, the amount of data we receive per day (20+ terabytes) can only be stored for as little as 24 hours."

In recent years, the NSA has attempted to segregate exclusively domestic US communications in separate databases. But even NSA documents acknowledge that such efforts are imperfect, as even purely domestic communications can travel on foreign systems, and NSA tools are sometimes unable to identify the national origins of communications.



Moreover, all communications between Americans and someone on foreign soil are included in the same databases as foreign-to-foreign communications, making them readily searchable without warrants.



Some searches conducted by NSA analysts are periodically reviewed by their supervisors within the NSA. "It's very rare to be questioned on our searches," Snowden told the Guardian in June, "and even when we are, it's usually along the lines of: 'let's bulk up the justification'."


Greenwald isn't kidding about the "broad justification." The slides tout the breadth of the search program, which provides results other programs can't. As is stated in the opening slides, XKeyscore allows agents to pull up tons of data (in search of "anomalies") and work backward to refine the results. The justification for these broad searches is available via a pulldown menu, as can (sort of) be seen in this screenshot, which gives agents a variety to choose from. (From the list, it appears that anything ending with "outside the US" is fair game.)

XKeyscore utilizes a variety of plugins to allow searches, including email addresses, phone numbers, IP addresses, full logs of every DNI session and machine-specific cookies. This gives agents an advantage other surveillance programs don't.

The slides warn that the data collected will be too large to parse (or even store for a great length of time). It recommends harvesting first and "selecting" second, in order to refine the results (using a "Strong Selector"). Agents are directed to look for "anomalous events," some of which seem a bit troubling.

These "anomalies" are common enough that plenty of non-terrorists will be getting a second look from agents utilizing this program. And again we see the NSA's instant distrust of anyone using encryption. This is one of the hazards of "collecting it all" and then working backwards. It's easy to make common behavior look suspicious if you start at an end assumption and connect the dots in reverse.

Also troubling are some of the suggested applications of the search program shown in the slide deck, including "show me all the VPNs startups in Country X" and "show me all exploitable machines in Country X."

On top of this, there's the sheer breadth of the program.

Because of the massive size of the data haul, metadata is retained and stored longer while more specific data is released. This still allows agents to perform broad searches to gather as much data as possible while relying on the stored metadata to put other connections together. Once they have the connections, the shallow search can be better utilized with the "strong selectors."

The data harvested isn't solely relegated to foreign communications, no matter what the pulldown menu says. The power of the database pretty much guarantees the inadvertent collection of data on American citizens. This is exacerbated by the fact that some web traffic will be indeterminate in origin or termination. This leads to violations of the few laws that do pertain to NSA data collection, something the NSA documents admit is a problem. Of course, as Snowden pointed out, there's always a solution.

Speaking of "justification," the slides claim that over 300 terrorists have been caught using XKeyscore. And the NSA has responded to the Guardian's leak with the usual claims that everything here is legal and audited, etc., which, again, doesn't make it right or even constitutional. It just makes it what it is: the end result of more than a decade's worth of expansion, secret law interpretations and compliant administrations.

Filed Under: metadata, nsa, nsa surveillance, search, surveillance, xkeystone