Automate and chill

Wendy M Grossman looks at why storing communication data is an invasion of privacy, even if it does not come into contact with human eyes.

Image: Silicon Valley Art by Scott Johnson CC BY-NC-ND 2.0

Is it an invasion of privacy if your intimate communications are stored and scanned by an automated system but not seen by a human? Here I want to nail down why the answer is yes.

This question - and much else - came up at Wednesday evening's debate between Geoffrey Robinson QC, the investigative journalist Duncan Campbell, and David Omand, former head of GCHQ and security and intelligence coordinator for the Cabinet Office (2002 to 2005). We were in Chatham House, home of the Rule that keeps backroom deals among the "great and good" secret, though not bound by it: we were encouraged to tweet (except that there was no wifi and many mobile phones didn't work in that basement... but whatever).

The person who brought it up on Wednesday was Omand, responding to a questioner who suggested that, given today's tools, "man's fallible temptation to delve" might take over, nullifying any rules created to regulate access to the collected piles of data.

"We could almost advance the argument that we're safer with computers because they're not conscious," Omand said. "They don't actually read the stuff. If there were a human - or a thousand humans - reading it they might be tempted." In the Guardian last month, he wrote similarly:

This involves computers searching through a mass of material, of course, and that might include your and my emails and data on our web traffic, but it is only the legally requested material that ever gets seen by a human being. These computers are not conscious beings: they will only select that which they are lawfully programmed to select. To describe this process as monitoring all our communications or "the surveillance state" or a "snooper's charter" is wholly misleading and a perverse reading of the situation.

So Omand's contention is that computers can't be tempted to break rules because they don't get curious and can't be bribed ("Free electricity"), blackmailed ("I'll tell other machines you're having an affair with that Galaxy Note II"), or socially engineered ("Nude pics of Anna Robotova - click here!"). He is also claiming that the piles of data do not matter until or unless human assessment is involved - and apparently assuming that such involvement will always be legal.

The first obvious response is the most general: clearly Europe fundamentally disagrees, or it wouldn't have put such effort into enshrining the data protection principles in law. That law does not distinguish between machine and human access; it assumes that all processing, no matter how automated, has a human ultimately in charge.

The claim that automatic scanning is less invasive is a justification, not a fact. The focus on content is a decoy, easily accepted because most people can readily imagine the unpleasant consequences if something they've just written were read by the wrong person: the note asking for help fixing your boss's mistake; the explicit note to a lover; financial figures; 250,000 diplomatic cables... It is much harder to feel the same punch to the gut about the analysis of metadata - yet the consequences of the latter may be much worse. One explicit email can be explained away; the fact that your mobile phones are seen together at the same hotel every Wednesday afternoon can change your life. The ACLU's Jay Stanley uses the term "reverberations" to express the idea that the privacy issue is less about who or what sees the data than about what happens to you as a result. As he writes, knowing we are being watched - by whatever - chills our behavior.

The present limitations of natural language processing and artificial intelligence mean that machines suck at content analysis. Traffic data, however, is perfect for machines. So when Obama - or Omand - says "no human is reading the content", they're glossing over the real game: the traffic data that feeds Big Data, today's industry buzzword. This tactic of diversion is very like the UK government's repeated insistence in the mid-2000s that the ID card was no threat because citizens would not be forced to *carry* it. As campaigners understood, the real game was the underlying database.

As long as you have humans in the loop deciding what queries are run, the "temptation to delve" still applies - remember, that perfectly functioning, omniscient, black-box, tamper-proof Machine on Person of Interest is fictional. The humans using and programming the machine will always be targets for bribery, blackmail, or deception, and the machine and its databases can be hacked, corrupted, or bug-ridden.

And: you'd better hope there are humans in the loop because otherwise you've got machines making intimate decisions about people's lives. It is no consolation that no human has read your email if automated processing adds your name to the no-fly list, falsely accuses you, wrecks your credit score, or decides you're fit for work and denies your benefits claim. The bad news is that too many of those humans blindly trust the machine, as Danielle Citron and others have established, because it's a safe way not to get fired.

Ultimately, behind every great machine stands a human with a finger on the panic button. It's sophistry to pretend otherwise.

Wendy M. Grossman’s Web site has an extensive archive of her books, articles, and music, and an archive of earlier columns in this series. Stories about the border wars between cyberspace and real life are posted occasionally during the week at the net.wars Pinboard - or follow on Twitter.