* This post was co-written with my friend Salman Salamatian, a brilliant guy I met while I was in Pittsburgh and who worked at Technicolor this summer.

Or do you not think so far ahead, ’cause I’ve been thinking about forever? – Frank Ocean

The words, actions and thoughts of famous people are always scrutinized in ridiculous ways. The idea of the perfect person, the perfect Audrey Hepburn or the always cool Steve McQueen, more often than not exists only in the collective mind of society.

After becoming famous and important, whether people are politicians, actors, or Internet superstars, they become branding experts, thought curators of their own minds. No sentence goes unexamined, no bad deed unheard of, and no private conversation is ever really private again. Publicists are hired, and PR firms become a necessity.

However, all of these people get scrutinized mostly after they know they will be. So far, no presidential candidate has had their entire life laid out in a Facebook profile. None has had the drunken tweets they posted at 22 examined. I even wonder whether political parties will someday discriminate against the very existence of a digital record of one’s life. Or maybe we will abandon the expectation of immaculately perfect individuals instead.

Not to play Captain Obvious here, but we must then be very thoughtful about what we post online. I believe few people have realized that someday our grandchildren may try to unearth all of our digital records, examining every single “like”, comment, or email we ever sent. Thus, we should become curators from the very beginning, lest we regret it later on.

1. tl;dr – Most people are not aware that they are sharing too much information

Some people think that we should stop whining and come to the realization that privacy is dead. Luis von Ahn, for example, is vocal about privacy being a creation of the 20th century. Before people concentrated in big cities, everyone in the village knew about you and your activities. It was the sheer number of people in big urban centers that created the illusion of privacy; your actions became insignificant there. Therefore, the digital revolution is only taking us back to our old situation.

We, however, do not buy that idea. At least not right now. It will be a while until people lower their expectations, and the transition will always involve a lot of whining. Furthermore, if people demand privacy, privacy they shall receive; users should not adapt to us, we should adapt to users.

2. tl;dr – Privacy was a child of the 20th century; get used to not having it. Maybe.

This transparency in our lives is enabled by the systems the software industry creates. Our decisions as engineers to log the user’s activities for a period of time are important. With the rise of scrutiny against regular people’s lives by recruiters, school admission committees, and possible dates, the question of privacy in our online activities is regularly discussed.

For engineers, the issue is how to treat privacy in the systems we build, even if it’s just for a transition period.

For this, we must ask: what is privacy? We believe that it is not only the ability to hide information from people, but the ability to fairly use it. We don’t even really control what information we hide from whom in real life: whenever we share anything with anyone, we are trusting that person not to go and broadcast it to the world.

3. tl;dr – Privacy is not the ability to hide information, but the ability to fairly [re]use it.

Now, there is a big, nasty confusion between data and information. Data and information are two completely different things. To understand privacy, it is really important to understand how these two concepts relate.

What we reveal to our Facebook friends and fellow tweeters is not information; it is data. Whether or not this data becomes information is a different question.

We, however, tend to forget this distinction, simply because our subconscious does a heck of a job of transforming data into information. We see samples of data and build a complex model of correlations to extract information from them. Thus, we suppose: eating deals with hunger, people who drive Ferraris are rich, and two protuberances in the chest are good indicators of a female. We are always building cause-and-effect correlations, and we never notice it.

Does this picture reveal any information to you?

It’s just data, isn’t it? In order for us to understand it, we must be acquainted with the work of René Magritte.

Thus,

information = data + metadata

That is, information is data in context (which engineers sometimes call metadata). Hiding context (identities, location, and so on) could be a sure-fire way of sharing data without revealing information. This, however, is a technical problem, a Very Hard Problem.
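The data/metadata split can be sketched in a few lines of code. This is only an illustration of the idea, not a real anonymization scheme; the record fields and the `strip_context` helper are hypothetical names made up for this example.

```python
# A minimal sketch of "information = data + metadata".
# All field names here are hypothetical, chosen only for illustration.

record = {
    # Data: the payload being shared.
    "data": {"text": "Had a great time at the concert last night!"},
    # Metadata: the context that turns the data into information.
    "metadata": {
        "author": "mary.doe",
        "location": "Pittsburgh, PA",
        "timestamp": "2012-09-14T23:12:00",
    },
}

def strip_context(rec):
    """Share the data while hiding the context (metadata)."""
    return {"data": rec["data"], "metadata": {}}

shared = strip_context(record)
assert shared["metadata"] == {}          # context removed...
assert shared["data"] == record["data"]  # ...but the data is intact
```

Of course, real data often leaks its own context (writing style, faces in photos), which is exactly why hiding context is a Very Hard Problem rather than a dictionary operation.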

So, when we build privacy into our systems, it is not really about protecting what our users share (data), but about allowing information to be used fairly. Now, of course, the real challenge for us is to be able to differentiate between data and fair use of information.

4. tl;dr – Information is data in context. Data itself is not relevant. As engineers, we should care about protecting the use of information.

Since differentiating data from information is a hard technical problem and a long philosophical discussion, is the best way to manage privacy in our systems a technological one then? Or could we turn to, say, Economics?

In order to manage anything, one should first determine two things:

What is it that you are managing? – Fair use of information.

How do you know if you are managing it well? – By ensuring that whoever discloses information receives fair compensation for it.

Now, we must define what fair compensation is, and in order to do so, we should come up with a way to value information consistently: all economic players must agree on these valuations.

Now, valuing information is also a Very Hard Problem. We could, for example, have users report the estimated impact of disclosing information on their daily lives. With enough samples, we should be able to set a price for different types of information. This is a very simple proposition, and one that shrewd players would easily take advantage of. Given that our area of expertise is not Economics, we will leave this problem to the economists for now.
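The naive scheme described above could be sketched as follows. The categories and dollar figures are made up for illustration, and averaging self-reported impacts is exactly the kind of mechanism shrewd players would game; this is a toy, not a pricing model.

```python
# A toy sketch of the self-reported valuation scheme: users report the
# monetary impact of a disclosure, and the average over enough samples
# becomes the price for that type of information.
# All categories and figures below are hypothetical.

from statistics import mean

reported_impacts = {
    "work_history": [120.0, 80.0, 95.0],   # dollars per disclosure
    "home_address": [300.0, 250.0],
}

def price_of(info_type, reports):
    """Price an information type as the mean of reported impacts."""
    samples = reports.get(info_type, [])
    if not samples:
        raise ValueError(f"no samples for {info_type!r}")
    return mean(samples)

print(price_of("work_history", reported_impacts))  # ~98.33
```

Note the consistency requirement from above: this only works if all economic players accept the resulting price, which self-reporting alone cannot guarantee.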

For the sake of this post, however, assume that such valuations exist. We should, then, build software systems that do not seek to hide information, but that make economically sound use of that information.

Let’s walk through a scenario from a Privacy-Economics-aware Facebook:

Mary gives her work history to Facebook.

Facebook uses that information to let recruiters target Mary for job positions.

Recruiters contact Mary through a one-time-use channel to offer relevant positions.

Facebook gets k dollars for facilitating the information flow from Mary to the recruiters.

Mary receives free service from Facebook for N months (valued at k dollars).
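The accounting in this scenario could be sketched as a tiny ledger. The `Platform` class and its method names are hypothetical, invented for this post; the point is only that the recruiter’s fee and the user’s compensation are recorded as two sides of one transaction.

```python
# A minimal sketch of the transaction in the scenario above.
# The Platform class and its API are hypothetical.

class Platform:
    def __init__(self):
        self.revenue = 0.0
        self.user_credit = {}  # user -> months of free service

    def broker_disclosure(self, user, recruiter_fee, months_of_service):
        """Recruiters pay a fee; the user is compensated in service.

        The fee and the service credit are booked together, so the
        disclosure is always paired with its compensation.
        """
        self.revenue += recruiter_fee
        self.user_credit[user] = self.user_credit.get(user, 0) + months_of_service

fb = Platform()
fb.broker_disclosure("Mary", recruiter_fee=50.0, months_of_service=6)
assert fb.revenue == 50.0
assert fb.user_credit["Mary"] == 6
```

In a real system the hard part is not this bookkeeping but the valuation step: deciding that k dollars of revenue is fairly matched by N months of service.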

Here, fair use of information is managed via a one-time-use channel for communicating information, but fair use could also be achieved by other means, like information tracking. Information tracking would seek to limit how users redistribute information. Because of technical difficulties, information-tracking and access-control techniques have been unsuccessful in the past (DRM, anyone?). We, however, do believe it is possible to create nice implementations of this solution, given that it would benefit most people.

5. tl;dr – We should consider creating a microeconomics system to control the privacy of information. This system would rely on correctly assessing the value of any piece of information and providing means for fair transactions.

However we implement our systems, the point is that, as engineers, we should think of privacy as fair use of information from an economical standpoint and not from a purely technical one. Engineers should look at privacy from two different standpoints:

How do we reveal data without revealing information?

How do we use (monetize) information in a fair way?

This take on information and privacy would:

allow companies to know how much they can expect to earn from monetizing information.

allow companies to understand how to incentivize or disincentivize certain uses of their platform.

allow users to be responsible about what they share, knowing the benefits or damages they will receive by disclosing information.

make users aware of the need to be active curators from the very beginning.

Privacy in software systems as we have experienced it so far is a lost cause. We need ways to strictly control the disclosure of private information in a few very specific domains (medical information is the best example), and, for the rest, make information worth distributing. Otherwise, companies will just make money off our backs, and we will keep hearing stories about people getting fired because of pictures they posted on social networks.

Until we build this next Privacy Economy, be mindful about your data.

6. tl;dr – A Privacy Economy would be a Nice Thing to Have. As engineers, we should strive to build it and lay the foundations in our systems to incentivize it.

—

If you liked this, consider following us on Twitter: @aggFTW & @salmansa