On Tuesday, Donald Trump took to Twitter to draw attention to an important story about a large-scale National Security Agency surveillance program, though largely for the wrong reasons.



Wow! The NSA has deleted 685 million phone calls and text messages. Privacy violations? They blame technical irregularities. Such a disgrace. The Witch Hunt continues! — Donald J. Trump (@realDonaldTrump) July 3, 2018

There are two significant errors here and one important truth. The first error is that NSA is in the process of deleting “Call Detail Records”—or metadata—about phone calls and text messages, not the calls and messages themselves. The second error is that the records being purged were acquired pursuant to a counterterrorism authority: They have nothing to do with Special Counsel Robert Mueller’s investigation into Russian interference in the 2016 presidential election (or “The Witch Hunt,” as Trump is fond of characterizing it). The important truth is that the dramatic purge of hundreds of millions of records is indeed an attempt to remedy “technical irregularities” that led to privacy violations: the acquisition by NSA of large numbers of private telecommunications records it was not legally entitled to receive. The New York Times reported on this in late June, and David Kris, former head of the Justice Department’s National Security Division, wrote a helpful explanatory post for Lawfare.





I can offer a bit of additional perspective because I participated in a call held by intelligence officials with privacy advocates on Monday, July 2nd to discuss the CDR deletion. As those officials described it, the issue arose as a result of problems in the systems of the telecommunications providers from whom these records were sought. This introduced errors into some of those records, which were not immediately detected because the problem only manifested itself under specific circumstances. (The officials were, understandably, unable to go into any technical detail here.) But when those records were subsequently fed back into the system as part of regular requests for additional records, those initial errors were compounded, leading to NSA obtaining records it was not authorized to receive. Since the erroneous records are difficult to identify, NSA opted to purge the entire database, though now that the technical issues have been resolved, the agency will be able to repopulate it by re-requesting records for which there’s still an applicable Foreign Intelligence Surveillance Court order in place.





That’s what we got on the call, but it’s probably necessary to have a bit of background to understand what it really means in practice. Until 2015, as we learned from disclosures by former NSA contractor Edward Snowden, NSA was collecting nearly all domestic calling records in bulk. When analysts had a phone number they believed was linked to a foreign terror group, they queried this number against that massive database, pulling up not only the phone numbers in direct contact with that initial number, but the numbers in contact with those numbers, and then the numbers in communication with those second-degree contacts. In short, they built up a kind of social network graph extending out to three degrees of separation, or three “hops,” from an initial suspicious number. The USA Freedom Act of 2015 put an end to the practice of indiscriminate bulk collection, instead allowing NSA to obtain two “hops” worth of contacts from an initial suspicious number by making requests, approved by the Foreign Intelligence Surveillance Court, directly to telecommunications providers, rather than maintaining an enormous database itself.
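For readers who think in code, the “hops” idea can be sketched as a breadth-first walk over call records that stops at a fixed depth. This is only an illustration: the function name `contact_chain`, the record format, and the toy data are assumptions made up for this example, not a description of how NSA or any provider actually implements contact chaining.

```python
from collections import deque

def contact_chain(call_records, seed, max_hops):
    """Return {number: hop_distance} for every number within max_hops of seed.

    call_records is a hypothetical list of (caller, callee) pairs.
    """
    # Treat each call as an undirected link between two numbers.
    neighbors = {}
    for caller, callee in call_records:
        neighbors.setdefault(caller, set()).add(callee)
        neighbors.setdefault(callee, set()).add(caller)

    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        number = queue.popleft()
        if hops[number] == max_hops:
            continue  # do not expand beyond the permitted number of hops
        for contact in neighbors.get(number, ()):
            if contact not in hops:
                hops[contact] = hops[number] + 1
                queue.append(contact)
    return hops

# Toy data: A called B, B called C, and so on.
records = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
two_hops = contact_chain(records, "A", 2)    # reaches A, B, C
three_hops = contact_chain(records, "A", 3)  # additionally reaches D
```

In this toy chain, cutting the limit from three hops to two drops only one number, but in a real calling graph, where each number has many contacts, every extra hop multiplies the number of people swept in.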





So imagine what happens if (and here I’m necessarily speculating a bit about the nature of the errors) some technical glitch causes records to include wrong numbers: numbers the target of that FISC order wasn’t really in communication with. At that step, NSA has gotten records it is legally entitled to, though they contain some inaccuracies. But when those wrong numbers are fed back to the telecoms, NSA ends up improperly receiving the call records of people who were not really in direct contact with the initial target, records NSA lacks legal authority to collect.
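That speculative scenario can be made concrete with a small simulation. Everything here is hypothetical: the `TRUE_CONTACTS` table, the `provider_lookup` and `two_hop_request` helpers, and the way the glitch is modeled (a single spurious number added to the target’s first-hop results) are assumptions chosen for illustration, since the officials gave no technical detail about the real errors.

```python
# Hypothetical ground truth: who was really in contact with whom.
TRUE_CONTACTS = {
    "target": {"alice", "bob"},
    "alice": {"target", "carol"},
    "bob": {"target"},
    "carol": {"alice"},
    "stranger": {"dave", "erin"},
    "dave": {"stranger"},
    "erin": {"stranger"},
}

def provider_lookup(number, spurious=None):
    """Contacts a provider reports for `number`.

    `spurious` models the glitch: it maps a queried number to wrong
    numbers that get mixed into that number's results.
    """
    contacts = set(TRUE_CONTACTS.get(number, set()))
    if spurious:
        contacts |= spurious.get(number, set())
    return contacts

def two_hop_request(seed, spurious=None):
    """Return (first_hop, second_hop) numbers obtained for a seed."""
    first_hop = provider_lookup(seed, spurious)
    second_hop = set()
    for number in first_hop:
        # Each first-hop number is fed back to the provider.
        second_hop |= provider_lookup(number, spurious)
    return first_hop, second_hop - first_hop - {seed}

clean_first, clean_second = two_hop_request("target")
bad_first, bad_second = two_hop_request("target", {"target": {"stranger"}})

# People whose records arrive only because of the one wrong number.
improper = bad_second - clean_second
```

Here a single wrong number at the first hop ("stranger") pulls in the records of two people ("dave" and "erin") who have no connection to the target at all, which is the feedback effect described above.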





The first thing to note about this situation is that it’s still less violative of privacy than the initial NSA bulk program, under which the agency received everyone’s call records. The second is that it’s a good illustration of how small errors can easily be magnified by feedback loops when doing telecommunications analysis, and of how those errors can go undetected for years at a time thanks to the sheer volume of data involved. Needless to say, had NSA been authorized to seek an additional third “hop,” those additional iterations would have magnified the improper collection by another order of magnitude. The secrecy around the process means that the providers from whom records are sought may have limited ability to detect errors when they occur—but also that they have few real incentives to scrupulously limit production: They face no legal liability for coughing up too much information about their customers. And while these errors may not have anything to do with Donald Trump’s “Witch Hunt,” it seems entirely possible they may have led to innocent people being inappropriately subjected to further scrutiny: Since NSA itself can’t easily identify which records it obtained in error, we will probably never know.





As far as we can tell at present, this was a genuine mistake that NSA has now taken steps to correct. But the fact that it persisted unnoticed for as long as it did, and the way the iterative nature of the program worked to magnify small initial errors, should make us skeptical of the wisdom of relying on intelligence tools that require this kind of large-scale data collection.