

Managing data

In this digital era

Yields ethical thought



by Tim Aaron

Aurnhammer Philosopher and Social Media Strategist

How incredible would it be to live in a world where scientists, using advanced sensor technology, could monitor the properties of every atom in the entire universe? Knowing each atom’s mass, position, and velocity (and putting aside the uncertainty principle for argument’s sake), such scientists could, in principle, calculate the outcome of every physical event that would transpire in the future. Though monitoring physical phenomena at this level is obviously impossible, information management breakthroughs in the digital sphere promise profound insight into predicting future events through data analysis. Keeping this thought-provoking physics example in mind, let’s use it to make sense of a widely discussed topic: Big Data.

Before we can comprehend the notion of Big Data, let’s begin with something a little smaller, namely, Data. Data, in the abstract, are the raw, factual figures from which information and knowledge can be derived. Data is everything from the images, videos, tweets, texts, tumbles, stumbles, up-votes, down-votes, and upside down-votes that abound in the social networking community to the numbers, percentages, sensor measurements, and surveillance recordings that exist in the realms of science, government, and business. So what, then, is Big Data? It is a LOT of just that. Given IBM’s estimate that the world creates 2.5 quintillion (a billion billion) bytes of new data every day, it is easy to see how some data sets grow large enough to render average software tools futile for storing, processing, and managing them. [1] To put the daily volume of data creation into perspective, Steve McKee, president of McKee Wallwork Cleveland, writes, “all of the earth’s oceans contain 352 quintillion gallons of water; if bytes were buckets, it would only take about 20 weeks of information gathering to fill the seas.”[2] McKee is not using this analogy to stimulate conversation on technology’s effects on global warming, though it does make you think. Rather, he is commenting on the gargantuan bulk of data that flows across servers and makes its way into databases before being interpreted and understood. Anyway, bytes or buckets, why are we so interested in taking on the challenge of organizing massive data sets in the first place? What practical application does the effective management of big data have?
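For the curious, the bytes-and-buckets figure is easy to check. The short sketch below simply divides the two numbers quoted above; it assumes, as the analogy implies, that one byte stands for one bucket and one bucket holds one gallon. The function and constant names are made up for illustration.

```python
# Figures quoted above: IBM's daily estimate and McKee's ocean volume.
BYTES_PER_DAY = 2.5e18   # 2.5 quintillion bytes of new data per day
OCEAN_GALLONS = 352e18   # 352 quintillion gallons of water in the oceans


def weeks_to_fill(gallons: float, bytes_per_day: float) -> float:
    """Weeks needed to 'fill the seas', assuming one byte = one one-gallon bucket."""
    days = gallons / bytes_per_day
    return days / 7


weeks = weeks_to_fill(OCEAN_GALLONS, BYTES_PER_DAY)
print(f"{weeks:.1f} weeks")  # roughly 20 weeks, matching the quote
```

Under those assumptions the division comes out to about 141 days, which is where the "about 20 weeks" comes from.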

In the same way that we postulated that monitoring every atom in the universe could allow scientists to calculate future outcomes of physical phenomena, sound and efficient management of big data could allow organizations in every industry to anticipate what is to come and to make decisions based on comprehensive, statistical knowledge. Don’t believe that such management of big data is possible? I think the $15 billion invested in data management and analytics by powerhouse firms such as Software AG, Oracle Corporation, IBM, Microsoft, SAP, and HP speaks for itself. So what, specifically, is to be gained from such a substantial investment?

One of the tried and true applications of more effective big data management is in the realm of marketing. Businesses using advanced data management systems that decipher and give meaning to big data are better able to assess their customers’ preferences and purchasing habits by interpreting data on who purchases and uses items, and on what, where, when, how, and even why they do so. In the recent past, software tools have been created that enable marketers to trace customers’ purchasing paths from their first internet search to their last, at which point the item is purchased.[3] With data being pulled from sources ranging from online marketplaces to social networking sites, businesses using big data management software have sharpened and capitalized on their competitive advantages, resulting in increased revenue and profit. All sounds great, right? How, then, could the collection and analysis of individuals’ data stir up such controversy in the media?

Well, imagine that you had bought a wedding ring online and, even before you knelt to propose, every single one of your friends and acquaintances on Facebook, including your fiancée-to-be, knew what was going to happen. What a way to ruin a surprise, right? Frighteningly enough, this is one of the many stories that came out of Facebook’s Beacon controversy. Failing to realize the ethical dimension of gathering and distributing data on people who, for all intents and purposes, believe they are operating in a “private-enough” environment while using the internet, Mark Zuckerberg created a program that tracked Facebook users’ activity on third-party websites and posted what they were doing on their friends’ news feeds. After realizing the gravity of his mistake and the extent to which Beacon infringed on people’s right to privacy, Zuckerberg dismantled the program and apologized to users, but not before damage was done and issues were raised.[4]

While the management of big data has many practical applications that allow enterprises to determine trends and act upon as comprehensive a knowledge of activity in any given industry as is currently possible, we cannot forget that much of the data gathered is about people, and that moral considerations come into play in the handling of such information. Privacy blunders such as Zuckerberg’s Beacon program demonstrate how big data management can be misused. Although this has not yet pushed users away from social networking sites, it has reiterated a point that many find unsettling while browsing online from the privacy of their own homes: the internet is a public space and online activity is being closely monitored; any data that one provides over the internet is accessible.

Because I began with a physics example, why not end with one, as something for big data management firms and data mining sites to consider? Meditating on the ethics of doing big data research, Danah Boyd, Senior Researcher at Microsoft Research, writes, “The Uncertainty Principle doesn’t just apply to physics. The more you try to formalize and model social interaction, the more you disturb the balance of them.”[5]

Now that you have some food for thought, let us know what YOU think about any aspect of the concept that will revolutionize the way that data is both gathered and used. Only through collaborative discussion will we advance knowledge and achieve enlightenment.