[By Newtown grafitti under Creative Commons]

At one point in the second episode of the seventh season of Through the Wormhole, a science documentary series narrated by Morgan Freeman, the camera focuses on the bewildered face of a young man. He shakes his head and mutters, “That’s really eerie.” He is referring to a feat by a researcher at Carnegie Mellon University, who took a photograph of him, a complete stranger, on a smartphone, and showed him the screen moments later. It displayed not only his name but also his birthplace, interests and other personal details.

Starting with just a photograph and no other information, facial recognition software pulled out his name, which then led to his social media profile, opening up more details, all within seconds. The camera then shifts to Alessandro Acquisti, a professor at the university who studies the economics of privacy. “Most of us post photos of ourselves online, but not everyone realises that photos are also data. And no one stole the data from us. We are willingly and publicly disclosing it.” The title of that episode: ‘Is privacy dead?’

The topic of privacy is very much alive and kicking in India these days even as a nine-judge bench at the Supreme Court is deciding whether it is a fundamental right guaranteed by our Constitution. Since the question itself came up at the court as a result of legal battles around Aadhaar, and because the same lawyers are fighting it out, many associate the subject strongly (even exclusively) with Aadhaar, India’s biometric identity programme.

However, privacy is a much bigger issue. To see it only through the lens of Aadhaar, urgent as that debate is, is to miss the bigger picture. Here are three reasons why.

The question on privacy is central to matters as varied as laws on homosexuality and clinical trials, police action and the practices of big tech companies—not just Aadhaar. The Indian government runs a number of programmes whose stated purpose is surveillance, and they have serious privacy implications. Consumers increasingly share more data with tech companies, and the capacity of tech companies to crunch that data is growing exponentially.

1. Right to privacy is central to a range of human endeavours, not just Aadhaar

Privacy is not among the six fundamental rights guaranteed by the Constitution. However, since the idea of privacy is so intertwined with the idea of personal liberty (which is guaranteed by Article 21 of our Constitution), it has come up in several court cases, including those that concern disclosure of medical records, police surveillance, media freedom and homosexuality. In deciding the Naaz Foundation case, for example, the Delhi High Court considered the argument that Section 377 of the Indian Penal Code (which criminalises homosexual activity) violated the privacy of individuals, and was therefore unconstitutional. The following timeline not only highlights how the court’s views on privacy evolved over time, but also gives a sense of the range of cases where privacy rights come into play.

2. The Indian government runs a number of programmes for the sole purpose of surveillance—and with very little public scrutiny

In November 2008, 10 members of a terrorist outfit landed in Mumbai and carried out one of the worst attacks on Indian civilians, killing 164 people and wounding twice that number. The attacks stunned the nation. Across the world, such events have led to governments tilting the balance in favour of security and against civil liberties, and India was no different. The United Progressive Alliance (UPA) government made P Chidambaram the home minister, and he started putting in place some, though not all, of the key surveillance programmes meant to prevent such attacks.

India thus runs at least five programmes with the sole purpose of surveillance. These include intercepting phone calls, monitoring the internet, collecting data from multiple sources and sharing them with other government agencies.

Central Monitoring System: The programme, sometimes referred to as the Indian version of the US National Security Agency’s surveillance programme codenamed PRISM, is operated by the Centre for Development of Telematics (C-DoT). It can monitor voice calls, SMS and MMS messages, fax communications and video calls across landline, CDMA, GSM and 3G networks. Prior to this, the government had to ask telecom companies to monitor such communication.

Lawful Intercept And Monitoring Project: Yet another project under C-DoT, it was set up to monitor communications, internet traffic, emails, web browsing, etc. The electronic surveillance system—using systems deployed at many levels, including by internet service providers—can match keywords and key phrases in text and audio in real time.

Network Traffic Analysis System: Called Netra, the system has been in place since 2014, and is used to monitor internet data in real time.

Crime and Criminal Tracking Network and Systems: Operated by the National Crime Records Bureau, the system aims to integrate the records maintained in silos by various state governments. The records will include not just those who were convicted, but also those who were investigated for a crime.

National Intelligence Grid: The programme, a part of the home affairs ministry, was set up in 2010 to give India’s security agencies real-time access to 21 citizen databases including bank account details, telephone records, passport data and vehicle registration details. Such data is already available with various agencies, and Natgrid’s aim was to integrate them. It’s not fully operational.

These systems were designed for the sole purpose of surveillance. It’s hard to find details about these programmes. We don’t know how exactly they work. We don’t know what the processes are. We don’t even know to what extent they invade our privacy.


There are two key differences between Aadhaar and these programmes. First, the purpose of Aadhaar is primarily to solve the problem of identity. Its aim is to give an identity that is unique and trusted across the country by the government and the private sector. The importance of having such an ID (personal, persistent, private and portable, as ID2020, a global public-private partnership, calls it) might not be evident to those of us who have the luxury of multiple IDs, but its importance in the field of development is well recognised. Jim Yong Kim, president of the World Bank, once said about Aadhaar: “This could be the greatest poverty killer app we’ve ever seen.” Second, Aadhaar is designed so that users will have access to and control of their own data, through a consent architecture, which is still in the works. This again could help the poor get basic public services and access to finance. It could open up economic opportunities. (A digital identity is a necessary, but not sufficient, condition for development. The state must also enhance its capacity to leverage the digital identity, and that is no easy task.)

However, because the identity is digital, and because it expedites the entry of a large number of people into the digital economy, Aadhaar comes with its own risks. One of those risks is surveillance. After all, the same technology that can enhance the capacity of the government to do what’s good for its citizens, can also enhance its capacity to do what’s bad. The answer to this problem—in a country where millions go to bed hungry, and have no access to education, healthcare and finance—is not to undermine the government’s capacity, but build checks and balances to ensure that risks are mitigated.

It demands an interplay of law, technology and communication. Let’s come to that a bit later, after taking a look at what privacy means in the age of social media, mobiles, Big Data, analytics, internet of things (IoT) and artificial intelligence (AI).

3. Consumers share increasingly more data with tech companies, and the capacity of tech companies to crunch that data is growing exponentially


Some years ago, I heard a co-worker shouting at one of his team members: “You shared my personal data with a stranger without my permission. How can I ever trust you again?” His colleague was visibly shaken and apologised profusely. I thought he had shared the loud-mouth’s bank statement or itemised phone bill or something even more personal. However, it turned out he simply gave his phone number to a client. The meaning of data is so wide that we often try to gauge its nature and its importance based on the outrage with which the word is uttered. We often get fooled.

Economics 101 states that all of us face trade-offs. We give up something we like in order to get something else we like. We give up some of our data to get benefits from the government and from private players. It worked fine in the analogue world. But now we share more data, more kinds of data, from different sources. At the same time the ability of the companies to analyse these data has grown exponentially. So has their appetite for data, even as they realise that data is the new oil (check out Charles Assisi’s piece on the subject). These have important implications for our privacy.

When we think of data, we usually think in terms of structured data: pre-defined variables, numbers, dates, text strings. For example, bank statements, expenditure statements, utility bills and so on. Most of these result from a conscious act. For example, when I make a call, I know that its details, such as the number I dialled and the time I spent on the call, will be recorded. Or when I buy a product using my credit card, I know that it’s not the same as buying it with cash, and that my bank knows about the transaction, and even that this data will be shared with a credit rating agency like Cibil.

But what we don’t instinctively realise is that it’s possible to create a fairly accurate profile by using this data. Some of the insights might come as a surprise to us. For example, back in 2008, when I connected my bank statements to a web application and looked at my spending patterns, I realised that I spent a lot more on eating out than I thought, and a lot less on books than I believed. By looking at my data, my bank can potentially know more about me than even I can, just as a doctor can know more about me by looking at my diagnostic reports, which I would not understand unless I found a way to read them.
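To make the point concrete, here is a minimal sketch of how little it takes to turn a list of transactions into a spending profile. Everything in it is an assumption made for illustration: the merchants, the amounts, the keyword rules and the function names are invented, and a real bank or app would work with far richer signals.

```python
from collections import defaultdict

# Hypothetical transactions (merchant, amount in rupees), invented for illustration.
TRANSACTIONS = [
    ("Cafe Mocha", 850.0),
    ("City Bookshop", 400.0),
    ("Pizza Corner", 1200.0),
    ("Metro Recharge", 500.0),
    ("Cafe Mocha", 650.0),
]

# Hypothetical keyword rules mapping merchant names to spending categories.
CATEGORY_KEYWORDS = {
    "eating out": ("cafe", "pizza", "restaurant"),
    "books": ("book",),
    "transport": ("metro", "taxi"),
}


def categorise(merchant):
    """Return the first category whose keyword appears in the merchant name."""
    name = merchant.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in name for keyword in keywords):
            return category
    return "other"


def spending_profile(transactions):
    """Return each category's share of total spending, rounded to two decimals."""
    totals = defaultdict(float)
    for merchant, amount in transactions:
        totals[categorise(merchant)] += amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {category: round(total / grand_total, 2) for category, total in totals.items()}


profile = spending_profile(TRANSACTIONS)
print(profile)
```

With this made-up data, three quarters of the spending falls under eating out, the kind of pattern that is easy to miss when looking at transactions one at a time.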

So, this is the imbalance of power that we face today. A data controller (a bank, a telecom company, or someone who is in a position to consolidate these data sets) who has the tools to collect, organise, analyse and draw insights, can potentially know more about some crucial aspects of my life than even I can. In a typical Johari window that psychologists use, these details about my life, squeezed out of disparate data that went from me, would fall into the quadrant that says, “Known to others, Not known to me.”

But it doesn’t end there. Structured data is only the tip of the iceberg. There are humungous amounts of unstructured data that data operators, typically the big tech companies such as Google and Facebook, have access to: the mails we send, the messages we type, the pictures we post, the videos we share. In 2013, High-Tech Bridge, a Geneva-based cyber security firm, decided to test through a simple process whether these companies snoop on such messages. It sent private messages containing unique URLs and checked who clicked on them. It was Facebook, Twitter and Google. We can’t complain, since Google’s terms clearly state that it scans mails, but we really have a lot to worry about. As Jim Killock of Open Rights Group told The Guardian, “It is the amount of information they hold on individuals that should be concerning us, both because that is attractive to government but also sometimes that information leaks out in various ways like the NSA’s use of cookies in general as a means to target users.”
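The logic of such a test can be sketched in a few lines. This is not High-Tech Bridge’s actual tooling, which the source does not describe; the base URL, the service names and the function names below are hypothetical. The idea is simply that if each secret URL is sent through exactly one service and never shared elsewhere, any request for it must have come from that service.

```python
import secrets

# Hypothetical web server controlled by the tester; not a real endpoint.
BASE_URL = "https://example.test/t/"


def make_tracking_urls(services):
    """Create one unguessable URL per service, to be sent in a private message there."""
    return {service: BASE_URL + secrets.token_hex(8) for service in services}


def services_that_visited(urls, requested_paths):
    """Return the services whose secret URL appears among the server's requested paths."""
    token_to_service = {url.rsplit("/", 1)[1]: service for service, url in urls.items()}
    seen = set()
    for path in requested_paths:
        token = path.rstrip("/").rsplit("/", 1)[-1]
        if token in token_to_service:
            seen.add(token_to_service[token])
    return sorted(seen)


urls = make_tracking_urls(["service_a", "service_b", "service_c"])

# Simulated server log: only service_a's secret URL was fetched.
simulated_log = ["/t/" + urls["service_a"].rsplit("/", 1)[1], "/favicon.ico"]
visited = services_that_visited(urls, simulated_log)
print(visited)
```

Because the tokens are random and appear nowhere else, a hit in the log is hard to explain unless the service itself opened a link from a private message.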

Some of us console ourselves saying that we have signed up to these services of our own volition, and we can get out anytime. In theory, yes. But in practice, it is probably more difficult than many of us think. A number of studies have shown that internet, digital and social media addiction is real. The addiction has a biological and psychological basis. The tech firms have compelling commercial reasons to keep it that way, nudging users to spend more and more of their time on their platforms with more and more intensity. It might be a stretch to compare this with drug addiction, but to the extent that addictions are seldom voluntary, we need to be concerned.

But it’s not just about our giving data to tech companies, and their nudging us to give more. It’s also about what they can do with the data. Take gigabytes or terabytes of my data, and let AI scan through them to draw insights about me, and then we are getting into the territory of what Donald Rumsfeld called “unknown unknowns”. In a well-known example, Facebook uses AI and pattern recognition to identify suicidal people based on their posts, to prevent suicides. It’s for a noble cause. But it also underscores the speed with which AI is advancing. Soon, if not already, the kind of insights and predictive abilities the tech majors gain about a person can be better and more accurate by an order of magnitude. When that happens, they will know our hidden motivations and trigger points better than we can. In a typical Johari window these insights will fall into the quadrant “Not known to me, Not known to others.” But the machines will know.


This raises serious questions even about data. My data is mine. Sure. But what about the output of the analysis done with my data (and when that analysis also had inputs from millions of others)? Can I claim the output to be mine too? When Google or Facebook say they delete my data on my request or share the data they collected about me with me, does it also include the insights they derived from crunching the data? They might even be justified in claiming those insights to be their own, but is that not about my privacy too? Suppose they delete the raw data, and I sign up again to those services a couple of years later, would they be able to add the old insights to my new profile?

These questions are important because even if I have all the data, I don’t have the ability to analyse it to know more about myself. Where are our Excel sheets and R programming skills against the power of AI? Not only do these companies hold data that is probably very private to me, but I am also at a distinct disadvantage because, using my data, others will know more about me than I can. Imagine someone knowing about a medical condition of mine that even I don’t know of.

These are some of the questions that Aadhaar does not even get into because it collects limited structured data. Aadhaar for all practical purposes is not voluntary. Signing up for Facebook, Google and Amazon is voluntary. But do I really know about the range of data they collect and do I really understand what they can potentially do with my data when I click “I Agree” on their terms and conditions? (It’s also in this context that Rahul Mathan’s argument that we have to look at a framework beyond consent gains importance.)


In short, in this age, we will be fooling ourselves if we think the privacy case in the Supreme Court is all about Aadhaar.

Such a view should in no way lead to a conclusion that Aadhaar is without privacy and security risks. A pragmatic way to look at Aadhaar is to see it as a development project, but be fully aware that by its very nature of being a digital platform, it comes with its own risks. It needs protection from the excesses and the limitations of the government. These are real risks. Mitigating those risks through a combination of better laws, better technology and better communication will make Aadhaar-as-a-development-tool stronger. It is one reason why many involved in the project, including Nandan Nilekani, and those who are using Aadhaar for development, have been pushing hard for a robust data protection law right from Aadhaar’s early days.

Three questions are pertinent.

1. Do we have institutional mechanisms to protect citizens’ data?

Irrespective of what the Supreme Court says about privacy being a fundamental right, it’s important that the country first passes a robust data protection law and then builds institutional mechanisms to ensure that the law is enforced. Fortunately, a lot of thought has already gone into what that law should be like. Mathan, for example, worked on a draft back in 2008 and more recently published a discussion paper on why we need to look beyond consent to address privacy concerns. In 2012, a committee under Justice AP Shah presented a report that listed nine privacy principles based on a study of privacy laws across the world. What we need is the political will to take these forward.

2. Do we have access to best-in-class technology and cybersecurity expertise?

Data security is not India’s problem alone. While the government has wrongfully gone after some individuals and organisations who exposed weak links in data protection, examples of security breaches from across the world suggest that the worst attacks are more likely to come from abroad. The question is, are we technologically prepared for them? Venky Ganesan, chairperson of America’s National Venture Capital Association, testifying before the US Senate, offered five suggestions that have some relevance for India too, and could be a good starting point.

Modernise government procurement systems so that the government has access to the best technologies. Set standards around cyber-hygiene. Enable legal frameworks for companies to share and exchange data. Create a generation of cyberwarriors. Use cyberinsurance to pool and minimise existential risk.

The full testimony is here.

3. Do we have systems in place to spread digital risk literacy?

Digital risk literacy has many elements to it, including the awareness that digital media can be addictive. One of its key elements, though, is the way we manage our own data. The so-called ‘free’ economy is in fact a barter economy. We are trading our data for the products and services various businesses offer. In effect, it means that we need to treat our data as we treat our money, except that it’s even more complex. We can wash our hands of money once we give it to others, but in the case of data, the burden actually doubles when we share. We still own it, we are still affected by it. That’s the way of digital goods.

Millions of Indians will be getting into this digital economy, and the rules of the game are difficult for digital immigrants to understand. (From my conversations with the digital natives, I am not sure if even they understand the risks.)

The risk literacy programmes led by Gerd Gigerenzer, a director at the Harding Center for Risk Literacy, a part of the Max Planck Institute for Human Development, Berlin, show that it’s possible to spread risk literacy through better communication and training programmes. Digital training is in fact a part of the Digital India programme, but by many accounts it is far behind given the enormity of the goal.


At one point in the 1998 movie Enemy of the State, the character played by Jon Voight says “Privacy’s been dead for 30 years because we can’t risk it. The only privacy that’s left is the inside of your head. Maybe that’s enough.” He plays the villain in the movie, and drives home the point that it’s a dangerous position to take. But, to assume that privacy is alive and kicking is also a dangerous position. Because the price of privacy, like its blood brother liberty, is also eternal vigilance. It needs protection. It needs to be kept alive. To do that, we have to pick the right battles.