The Washington Post has a dramatic new NSA story today, one that is qualitatively different from any of the previous Edward Snowden revelations. Written by Barton Gellman, Julie Tate, and Ashkan Soltani, the story describes a large cache of intercepted communications (roughly 160,000 email and instant message exchanges) and the benefits and privacy costs of the collection they reflect. The bottom line is that the benefits were huge but the costs were big too. As the story's lead reflects,

Ordinary Internet users, American and non-American alike, far outnumber legally targeted foreigners in the communications intercepted by the National Security Agency from U.S. digital networks, according to a four-month investigation by The Washington Post. Nine of 10 account holders found in a large cache of intercepted conversations, which former NSA contractor Edward Snowden provided in full to The Post, were not the intended surveillance targets but were caught in a net the agency had cast for somebody else.

Some thoughts on the story:

This is incredibly sensitive stuff, both from a national security perspective and from a civil liberties and privacy perspective. In contrast to a lot of prior reporting, the Post here appears to have done a pretty thoughtful job of deciding what information to report and what information to withhold. It describes the sorts of information the cache contains and its value without describing a lot of details:

Among the most valuable contents---which The Post will not describe in detail, to avoid interfering with ongoing operations---are fresh revelations about a secret overseas nuclear project, double-dealing by an ostensible ally, a military calamity that befell an unfriendly power, and the identities of aggressive intruders into U.S. computer networks. Months of tracking communications across more than 50 alias accounts, the files show, led directly to the 2011 capture in Abbottabad of Muhammad Tahir Shahzad, a Pakistan-based bomb builder, and Umar Patek, a suspect in a 2002 terrorist bombing on the Indonesian island of Bali. At the request of CIA officials, The Post is withholding other examples that officials said would compromise ongoing operations.

The operational consequences of having this information outside of government hands---even if not reported to the public---will be significant.

On the privacy side, the story describes categories of invasive surveillance, even giving details and chains of correspondence about romances, but it keeps names out of it. The cache includes:

medical records sent from one family member to another, résumés from job hunters and academic transcripts of schoolchildren. In one photo, a young girl in religious dress beams at a camera outside a mosque. Scores of pictures show infants and toddlers in bathtubs, on swings, sprawled on their backs and kissed by their mothers. In some photos, men show off their physiques. In others, women model lingerie, leaning suggestively into a webcam or striking risque poses in shorts and bikini tops.

I am certain there will be consequences to individuals, as well as to NSA, for publication of this story, but by and large, the Post deserves credit for taking the sensitivity of this information seriously.

The premise of the story is a bit less thoughtful. Of course incidental collection involving non-targets will outnumber collection against targets---by a lot. The simple reason is that a single target communicates with a great many people. The percentage of that target's communications with another target will be relatively low, and there will always be incidental collection around a target that does not directly involve him or her at all: think of a target's child borrowing his cell phone or using his computer. Intelligence collection is all about sweeping in information broadly and then winnowing it down and finding the important facts. Inherent in that project---and certainly contemplated by the law in a number of respects---is the idea that you're going to collect a lot of material involving a lot of people beyond targets. It's all but mathematically certain that this will be the lion's share of the take.

The defenses against this problem are, broadly speaking, twofold. The first is that you collect, retain, and disseminate only information that has a valid foreign intelligence purpose. But as the cache reflects, that does not mean you don't collect love letters. After all, if you're tracking an Australian who has gone overseas to join the Taliban, his relationship back home with a woman he might marry may well be important to understanding who he is, what he is thinking, and what he is planning. Indeed, the woman in question---whose name is mercifully withheld---doesn't seem to have much problem with the fact that a long correspondence about her love life was collected by NSA and its Australian partners:

Looking back, the young woman said she understands why her intimate correspondence was recorded and parsed by men and women she did not know. “Do I feel violated?” she asked. “Yes. I’m not against the fact that my privacy was violated in this instance, because he was stupid. He wasn’t thinking straight. I don’t agree with what he was doing.” What she does not understand, she said, is why after all this time, with the case long closed and her own job with the Australian government secure, the NSA does not discard what it no longer needs.

The story does raise an important issue about how broadly the agency should be collecting and retaining around a target. And Gellman and company do a good job of illustrating two things: the mathematical fact that the number of people affected by collection will be far greater than the number of targets, and the fact that the information collected will quickly come to involve sensitive material with only the most tangential relationship to foreign intelligence concerns. They also note, correctly, that in the criminal wiretapping space, we draw privacy protection lines in very different ways. I'm honestly not sure how remediable this problem is in the intelligence context without huge operational consequences. But it is certainly a valid question.

The second broad protection---for U.S. persons, anyway---is minimization. As the story notes:

Nearly half of the surveillance files, a strikingly high proportion, contained names, e-mail addresses or other details that the NSA marked as belonging to U.S. citizens or residents. NSA analysts masked, or “minimized,” more than 65,000 such references to protect Americans’ privacy, but The Post found nearly 900 additional e-mail addresses, unmasked in the files, that could be strongly linked to U.S. citizens or U.S. residents. . . . At one level, the NSA shows scrupulous care in protecting the privacy of U.S. nationals and, by policy, those of its four closest intelligence allies---Britain, Australia, Canada and New Zealand. More than 1,000 distinct “minimization” terms appear in the files, attempting to mask the identities of “possible,” “potential” and “probable” U.S. persons, along with the names of U.S. beverage companies, universities, fast-food chains and Web-mail hosts. Some of them border on the absurd, using titles that could apply to only one man. A “minimized U.S. president-elect” begins to appear in the files in early 2009, and references to the current “minimized U.S. president” appear 1,227 times in the following four years. Even so, unmasked identities remain in the NSA’s files, and the agency’s policy is to hold on to “incidentally” collected U.S. content, even if it does not appear to contain foreign intelligence.

Again, the story raises a valid question: Is the agency minimizing U.S. identities and communications in all situations in which it should? The details it provides are inadequate to venture an opinion on that subject. And once again, the story raises a tension that is to some degree inherent in the agency's project: A valid overseas target who is in communication with people in the United States is, for obvious reasons, of particular interest. He will also, however, by the nature of the activity that gives rise to that interest, be in contact with more U.S. persons than many other people will. And that means that incidental collection affecting U.S. persons will be greater. Minimization is a key protection for U.S. persons, but you don't want minimization of information that may be of foreign intelligence value. Wherever you draw the line here---or, rather, the many lines---you're going to pay costs both in privacy and in effectiveness. You'll retain information that is utterly innocuous but whose retention is corrosive of people's privacy, and you'll minimize information that will prove to have value. The question is how much of each harm you are willing to tolerate and when you want to err on which side of the line.

Finally, I want to say a word here about the ethics of this leak. Snowden here did not leak programmatic information about government activity. He leaked many tens of thousands of personal communications of a type that, in government hands, are rightly subject to strict controls. They are subject to strict controls precisely so that the woman in lingerie, the kid beaming before a mosque, the men showing off their physiques, and the woman whose love letters have to be collected because her boyfriend is off looking to join the Taliban don't have to pay an unnecessarily high privacy price. Yes, the Post has kept personal identifying details from the public, and that is laudable. But Snowden did not keep personal identifying details from the Post. He basically outed thousands of people---innocent and not---and left them to the tender mercies of journalists. This is itself a huge civil liberties violation. And we should talk about it as such. I suspect, alas, that we won't.