John C. Havens is the founder of The H(app)athon Project and author of the upcoming book H(app)y — The Value of Well Being in a Digital Economy (Tarcher/Penguin, 2014). He can be reached @johnchavens.

“The greatest trick the devil ever pulled was convincing the world he didn’t exist.” —Kevin Spacey as Verbal in The Usual Suspects.

It’s good that big data is so vast and mysterious. It’s good that the sheer volume of information that’s part of the big data landscape is almost impossible to comprehend. It’s good that, according to a recent Gartner poll, big data is falling into the trough of disillusionment.

All these things are good because big data wants you to forget it exists. Look the other way as algorithms begin writing themselves without human input, codifying the biases of original programmers with technologies that will soon surpass human sentience.

It’s easier to let disillusionment with data inspire inertia than work to fully tame the binary beast.

But if we don’t view data science via a framework based on sound hypotheses, we’re a sorcerer’s apprentice sweeping code in haphazard directions until it takes on a life of its own. The interconnected systems of our technically enhanced planet are making decisions without coordinated ethical or human guidance, which leaves us two choices: dystopia or discipline.

Context Over Code

“My biggest fear is that data science is used as a blunt tool and that people don’t understand the cultural implications of quantifying our world,” says Jake Porway, founder of DataKind, an organization that brings together leading data scientists and high impact social organizations through a comprehensive, collaborative approach that leads to positive action through data. He’s as much a data philosopher as scientist. Gifted at navigating the channel between hacking and hypothesizing, he is adamant about helping people understand how to create context as well as code for insights based on big data. He says, “This due diligence should be embedded in our craft.”

For the industry to evolve, it has to move beyond the debate of domain versus data science expertise. Porway's data scientists team up with departments in outside organizations, as in a recent project with DC Action for Children. The non-profit focuses on understanding childhood well-being and sought DataKind’s help to build an interactive visualization that would allow it and policy advocates to navigate government indicators in one coherent view. Data designers from the Gates Foundation and the Washington Post rallied to build an interactive map that was eventually unveiled to the mayor of Washington, D.C. and will now be used for public policy change.

“DC Action for Children had a cause, childhood well-being, that they wanted to improve with data,” says Porway. “The data scientists didn’t come in with arrogance but worked with the group as a team. Now it’s a piece of software that accelerates how the group achieves their mission of helping kids.”

From Siloes to Singapore

“Big data is just a marketing term. We’ve had problems understanding how to organize large pools of information for the past 20 years,” says Stewart Townsend, business development director EMEA at Zendesk. “But now organizations really need to get smart about removing siloes from their data to extract the best overall value for their work.”

As an example of the power of these merged data sets, he cites the work of Palantir, a Virginia-based company with a holistic approach that helped the DEA track down a Mexican drug cartel member who shot a U.S. drug enforcement officer. “If you’re not working to bring your data sets together, you’re jeopardizing your future,” says Townsend. “It’s a direct business angle.”

Along with his role at Zendesk, Townsend is one of the founders of Big Data Week, a global event taking place in April 2013. It launched last year in over 20 cities with over 2,000 attendees. Townsend and fellow founder Carlos Somohano created the organization to bring together communities focused on big data. Somohano is also the founder of Data Science London, which created the Data Science Academy to further spread education around these issues. Data Science London launched the first-ever Global Data Science Hackathon in partnership with Big Data Week and Kaggle. The hackathon challenged 212 data scientists in 10 cities around the world to create an early warning system for air quality control.

Both Townsend and Somohano feel the industry needs to address the issue of anonymizing data. “There’s a group in the UK focused on using non-personally identifiable data as a standard,” notes Somohano, “but the challenge with that approach is the potentially significantly less meaningful insights you can extract that can still lend value.”

Added to this ethical challenge are the complexities of dealing with natural language processing techniques and how big data is perceived in multiple markets. To help forge new paths along these lines, Big Data Week is going east this year, to locations like China and Korea. “We want to understand the challenges in those countries and cultures around data,” explains Townsend.

Data by Design

Design-driven data, however, is flourishing in the east, as exemplified by cities like Singapore. “Singapore is our model of a city where a living environment that is constructed to optimize beauty and efficiency is nonetheless built entirely on analytics and statistics,” says Roger Wood, founder of The (ART + DATA) Institute, which examines how data insights are used to make better products and experiences, which in turn, are designed to collect more useful data.

While the planning for Singapore was premeditated, it was based on people’s actual experiences in city settings. Green spaces "appear" in the city where they feel most natural, and street layouts are designed via analytics rather than following the haphazard cow paths of old.

One encounters a similar synthesis of design and data in properties like the New York Times and most modern websites. Articles are surfaced to readers based on a statistical analysis of which pieces are generating interest, rather than on the whim of an editor.

This design element marks another critical component needed for big data’s evolution: understanding how the value of a user can be best served in a seamless, if not invisible, fashion. Wood notes, “In the future, you won’t understand that products are collecting data about you — it will be invisible.” He cites Zipcar as a company to watch — it has recently improved mobile capabilities for users. He envisions a scenario where a future Zipcar app will only recommend nearby cars that match your preferences, rather than simply a list of vehicles in your zip code.

“Depending on where you are and when you ask the question, the app will be intelligent enough to know what you want,” says Wood. For instance, the app won't recommend a smaller car if it’s a Saturday and you enjoy mountain biking.

Wood also feels these types of experiences could expand into people’s social lives; a matchmaking scenario would combine your preference data with your immediate context to optimize an otherwise traditional blind date. As Wood notes, “Data doesn’t lie.”

“By creating a context around lifestyle design, we’re enabling a world where anybody can find data relevant to them,” says Martin Blinder, founder of Tictrac, a platform that aggregates apps so users can create and manage projects that improve their day-to-day lives. As in the example of Singapore, the power of the platform comes from helping users analyze the behavior they’re already focusing on and providing a context for moving forward.

What’s unique about Tictrac, however, is that it doesn’t want to function as a quantified self app but rather to become, essentially, the iTunes marketplace for all "life project" apps. The company has already brokered deals with multiple partners to help achieve this goal, and its platform also syncs with over 40 APIs. This means users don’t need to grant permissions to dozens of apps to create a life project on the platform, a valuable time-saving element.

Functioning as a data dashboard, Tictrac will eventually begin to learn a user’s behavior to personalize information. So if you’re focusing on improving your diet, the system would tell you to eat more protein.

Blinder believes this type of evolved adaptation will help people design their lives in the future. He says, “I feel we’re reaching a point in history where we can really empower ourselves based on an understanding of our own data.”

The Priority of Privacy

Edd Dumbill is a principal analyst for O’Reilly Radar, and program chair for the O’Reilly Strata Conference. The conference, launched in February 2011, is thought to have originated the term "data scientist."

Dumbill notes a number of trends for 2013, but the subject of privacy is an especially hot-button issue that needs to be addressed sooner rather than later. “When the car was invented, a lot of people died until the seat belt was invented,” he says. “The industry has to arrive at a level of self-regulation or it will get regulated by people who don’t understand what they’re doing.”

But while most discussions of big data privacy hinge on philosophical debate, Dumbill adds a welcome pragmatic insight to the mix. “We simply don’t currently have the technological infrastructure to manage privacy.”

Once you have a user’s data, you not only have to manage it but also store it and create systems to identify who has touched it. For most organizations, it is these technological hurdles surrounding privacy that keep them from fully leveraging big data. This is why vertically integrated data solutions will provide the ability for people to make better decisions. This integration will help us move beyond a granular scrutiny of the technology behind big data and focus on the massive insights it can bring when managed by a team of connected experts.

With regard to the future, Dumbill notes, “I would prefer not to look at the phenomenon that’s feeding the IT trend of big data and focus on our increasingly programmable world.”

The Rise of the Personal Data Economy

Another danger with big data is focusing on the granularity of specific implementations before helping users understand an overall paradigm for how advanced analysis will look in the enterprise.

“It’s more of an economic discussion than a technology discussion,” says Brandon Barnett, director of business innovation at Intel Corporation. Barnett works in conjunction with Ken Anderson, senior researcher at Intel Labs, and Eric Berlow, founder of Vibrant Data Labs, on We the Data, an initiative designed to use complexity science to create the portrait of a future data economy that doesn’t currently exist. As their site explains, “How do we balance our anxiety around data with its incredible potential? We the Data can create new forms of social cooperation and exchange, or give us more of the same corporate obsession with better targeted advertising.”

Note this language isn’t about demonizing marketers — it’s about shifting the conversation. We can't think of the ROI of innovation in quarterly terms, but rather, determine how the full breadth of big data can be understood within our current business environment.

Barnett notes, “The data economy is one example where our research suggests the world is moving, and yet traditional tools of market analysis fall short in identifying the products, services and business models that companies can pursue to begin to add value to the consumer.” This is why We the Data applies complexity science to business ecosystems to generate information about an economy that could be created. The goal of the work is to catalyze the imaginations of business leaders so they can envision a unifying infrastructure for the big data world that doesn’t yet exist, but could.

We the Data also addresses the fundamental issue of privacy by creating awareness around treating personal data as a resource. Today we primarily utilize our data as a transactional tool leveraged in the identification and purchasing of products. This limited landscape will soon evolve, Berlow notes: “The anxiety towards privacy is real, but what is much more difficult to communicate are the potential opportunities to improve the lives of everyday people – not just if we can address the issues of privacy and trust but because we address them.”

As examples of these opportunities, he cites the categories of health, open government data for citizens and micro-entrepreneurship. “Imagine what we might learn if more people shared their health data because they could easily control other aspects of their anonymity? Imagine if small entrepreneurs had easy-to-interpret access to market research data that is now only available to large corporations?”

We the Data provides an important model to follow as the personal data economy becomes codified over the next five to 10 years. As people begin to understand the power their data holds beyond commercial transaction, openness and value-creation can help shape the business landscape of the future. Our new focus will be on the word “share” versus “holder.”

The Global Human Being

Through the Internet, we are developing a species-level nervous system, capable of transmitting thoughts, ideas and information. The resulting meta-organism — this ‘global’ human being — is also beginning to exhibit physiological reactions and even ‘higher’ human traits like empathy and compassion.

The author of this excerpt is Jonathan Harris, the artist and co-creator of We Feel Fine. He is also quoted in The Human Face of Big Data, a multimedia book/app experience created by Rick Smolan and Jennifer Erwitt. In the book, lush visualizations and photos document dozens of real-world scenarios to help readers understand the context and breadth of the big data era.

It’s not always pretty, though. “I’m worried that the people spending the most time thinking about big data are corporations and governments,” notes Smolan. “Our legislators are not being informed about where this tech stands right now and where it’s heading.”

In the book, stats like "Facebook has 955 million monthly active accounts using seventy languages" and "one in three children born in the United States already has an online presence" paint a portrait of how pervasive data has become in our lives. Then you discover the major positive breakthroughs from companies like Proteus Digital Health, which has created ingestible event markers (IEMs) with hardware that provides ongoing real-time data from within a patient’s body.

The overarching effect of big data is we’ll soon be able to perform real-time polling of the human race, or literally listen to the heartbeat of the planet. But there’s no ethical framework guiding this massive expanse of data infrastructure.

“I’m not worried about privacy — I’m worried about proclivity,” Smolan says. “There is no set of laws that says, ‘This is how you should be judged.’ Algorithms are writing algorithms and we’re losing the connection of what guides the technology that’s beginning to govern our lives.”

The Devil’s Had His Due

It’s wise to see software not only as a product or service, but also as the staging ground for humanity’s future, affording us the time and space to get our ethics right before the stakes are raised. Big data is powerful, but ethically neutral — we have to choose how to use it.

Above and in his "Data Driven" essay, Jonathan Harris explains that, while big data is ethically neutral, it can be used like any tool: for good or evil. If two people look at a stick, one sees a lever and the other sees a weapon. But without an ethical framework to guide the context of what people are looking at in the first place, even the question “What do you see?” doesn’t make sense.

So in the end, big data isn’t the devil. We just need to focus on issues like collaboration, context and design to guide our analysis, or give way to disillusionment and dystopia.

In other words, the devil is in the details. So let’s work them out before he does.
