by Sarah Childress

Every minute, the world’s 2 billion Internet users upload staggering volumes of data to the web: an estimated 200 billion emails, 48 hours of YouTube videos, 684,478 posts on Facebook. And then there are Tweets, Instagram photos, text messages and blog posts.

But what happens when what seems like social media ephemera enters into the vaults of Big Data, where everything we do online, and increasingly offline, is collected and stored for analysis?

“Those who have access to and control the platforms have the largest, most powerful source of information about human behavior that anyone has ever had in human history,” Mark Andrejevic, deputy director for the University of Queensland’s Center for Critical and Cultural Studies in Australia, who studies surveillance and the web, told FRONTLINE. “And that’s information that they are going to use for their ends, whatever those might be.”

The biggest players in the data market are the U.S. government, with its vast web of classified programs, and commercial interests, including data brokers. The government’s goal is ostensibly to thwart terrorists. Companies, meanwhile, want to sell more products to the growing consumer market online, which is currently worth $200 billion for U.S. retailers alone.

Today, companies reach consumers with targeted marketing: placing ads based on your web searches, the content of your Gmail messages, or your purchasing history. Every time you log in to a site like Amazon, the recommendations you see are based not only on your purchase history and items you’ve viewed, but also on an algorithm that factors in the preferences of people with similar buying histories. Companies also use retargeting: blanketing the ad space with images of products you’ve viewed as you move on to browse other sites.
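The idea of recommending items based on "people with similar buying histories" can be sketched in a few lines of Python. This is only an illustration, not Amazon's actual algorithm: the shopper names, the purchase histories and the choice of Jaccard overlap as the similarity measure are all assumptions made for the example.

```python
from collections import Counter


def jaccard(a, b):
    """Overlap between two sets of purchased items, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, histories, top_n=3):
    """Suggest items bought by shoppers whose histories resemble the target's."""
    mine = histories[target]
    scores = Counter()
    for shopper, items in histories.items():
        if shopper == target:
            continue
        weight = jaccard(mine, items)
        # Credit every item the target has not bought yet,
        # weighted by how similar the other shopper is.
        for item in items - mine:
            scores[item] += weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


# Hypothetical purchase histories, invented for illustration.
histories = {
    "ana": {"tent", "stove", "boots"},
    "ben": {"tent", "stove", "lantern"},
    "cy": {"novel", "lamp"},
    "dee": {"tent", "boots", "lantern", "map"},
}
suggestions = recommend("ana", histories)
print(suggestions)
```

Here the shopper with nothing in common contributes nothing, while items favored by the two most similar shoppers rise to the top of the list.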

But as the volume of data we add to the web keeps multiplying (one estimate projects it will grow 300 times by 2020), it becomes more difficult for marketers to figure out whom to target with which ads and when.

So companies are turning to computer models that analyze these massive pools of information to make inferences about your health, personality traits, and even mood in real time, in order to help them predict, and ultimately influence, your next purchase.

Data Knows You Best

It only took a few months for Michal Kosinski, deputy director of Cambridge University’s psychometrics center, to build a computer model that could tell him, with 60 percent accuracy — based only on a person’s Facebook “likes” — whether their parents divorced before they were 21.

It could also tell, with 88 percent accuracy, whether a man was gay.

The study drew on data from 58,000 volunteers. Kosinski asked them for personal details, including their sexuality, whether their parents had divorced and whether they had used drugs, and split them into 10 random groups. He built the model using the information from nine of the 10 groups and then used that model to predict the traits of the remaining group. “We used very generic methods,” he said. “You don’t need access to secret data.”
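The nine-groups-in, one-group-out procedure is a standard technique usually called cross-validation. A minimal sketch follows, with loud caveats: the "model" here is a deliberately crude average of per-like rates rather than whatever Kosinski's team used, the like names are made up, and the synthetic data is perfectly separable, so its accuracy says nothing about real-world results.

```python
import random


def split_into_groups(records, k=10, seed=0):
    """Shuffle the records and deal them into k roughly equal groups."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def train(records):
    """Learn, for each like, the share of people with that like who have the trait."""
    counts = {}
    for likes, has_trait in records:
        for like in likes:
            yes, total = counts.get(like, (0, 0))
            counts[like] = (yes + has_trait, total + 1)
    return {like: yes / total for like, (yes, total) in counts.items()}


def predict(model, likes):
    """Predict the trait when the known likes lean toward it on average."""
    rates = [model[like] for like in likes if like in model]
    return bool(rates) and sum(rates) / len(rates) > 0.5


def cross_validate(records, k=10):
    """Hold out each group in turn, train on the rest, and score the predictions."""
    groups = split_into_groups(records, k)
    correct = 0
    for i, held_out in enumerate(groups):
        training = [r for j, g in enumerate(groups) if j != i for r in g]
        model = train(training)
        correct += sum(predict(model, likes) == trait for likes, trait in held_out)
    return correct / len(records)


# Synthetic data invented for illustration: the trait tracks one made-up like.
records = [({"like_a", "like_c"}, True) for _ in range(50)]
records += [({"like_b", "like_c"}, False) for _ in range(50)]
accuracy = cross_validate(records)
print(accuracy)
```

The point of holding a group out is that the model is always scored on people it never saw during training, which keeps the reported accuracy honest.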

Kosinski said the model was based on people’s Facebook likes, as well as their friends’ comments and tags. And it doesn’t only work for people who overshare on Facebook. “Often, the less active you are, the more informative is what you do,” he said, because that can suggest that those select interactions are more significant.

The study underscores the power that computer models have to find patterns where humans can’t. For example, the model found that people who liked “science” and “Mozart” tended to have high IQs, which isn’t so surprising. But so did people who liked “curly fries” and “Morgan Freeman’s voice.” Of course, there’s no obvious reason that liking curly fries, or Morgan Freeman’s voice, would go along with a high IQ. But among this group of participants, the model found a pattern that led to a fairly accurate prediction.

Kosinski’s study was small. Most predictive models draw on data from millions, or even tens of millions, of people, which makes them even more accurate. Such models, he said, also can learn from their mistakes. When they get a prediction wrong, they adjust to try to be more accurate in the future.

“It’s Like Minority Report”

Companies today are building much more sophisticated models as they gain access to information beyond just what you post online.

Part of the new information comes from what’s known in the tech world as the Internet of Things. This is where physical objects — such as televisions, refrigerators, thermostats, pacemakers, roads and cars — are equipped with sensors that gather data and upload it to the web for analysis. That data is used for all kinds of reasons: helping grocery stores keep shelves stocked, improving waste management and traffic control, figuring out the next original series that Netflix should commission.

It can also provide a wealth of data on individuals that the commercial sector can collect, analyze and combine with your online activity to market to you more directly.

For example, a company might photograph customers when they enter a store and again at checkout, to know what they look like and how much time they spent in the store. It could then combine that information with the customer’s purchasing history, zip code, demographics and other data that can help predict what the customer will buy next. When the customer returns to the store and is recognized by the cameras, he might receive a text with a coupon for his next purchase or a much more personalized interaction with the store’s employees.

“We’re able to see this kind of 360-degree view of an individual,” explained Jeff Eden, the principal at DEG, an agency that designs digital marketing strategies for companies. He added: “It’s like Minority Report. We’re there.”

Companies can use this data to map out the thousands of decisions a consumer might make next, such as likely purchases, and use that information to steer consumers toward the choices the companies want.

“Everyone’s not as unique as people think they are,” Eden said. “Especially when there’s 10,000 other people doing the same thing.”

Data Knows How You Feel

Soon it won’t be enough for companies to know what you might want. The most sophisticated want to know when you’re most likely to want it.

Last month, Apple submitted a patent application for technology that could make inferences about people’s mood in real time. “If an individual is pre-occupied or unhappy, the individual may not be as receptive to certain types of content,” Apple said.

The company’s solution: Figure out how a person is feeling at any given moment, and target content — ads — to be delivered at the right time and place. To do this, the company could establish a baseline profile of each user, and then make inferences about a person’s mood based on deviations from that baseline.
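The baseline-and-deviation idea can be illustrated with a short sketch. It is not the method in Apple's patent application, whose details the article does not give. It assumes, purely for illustration, that a baseline is a per-signal mean and standard deviation, that a deviation is a z-score, and that the signal names, readings and threshold below are hypothetical.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize a user's typical behavior as a mean and standard deviation per signal."""
    return {name: (mean(values), stdev(values)) for name, values in history.items()}


def deviations(baseline, current):
    """Express the current readings as z-scores relative to the baseline."""
    scores = {}
    for name, value in current.items():
        avg, spread = baseline[name]
        scores[name] = (value - avg) / spread if spread else 0.0
    return scores


def flag_unusual(scores, threshold=2.0):
    """Return the signals that sit more than `threshold` deviations from normal."""
    return sorted(name for name, z in scores.items() if abs(z) > threshold)


# Hypothetical readings, invented for illustration.
history = {
    "clicks_per_minute": [10, 12, 11, 9, 13],
    "heart_rate": [62, 64, 63, 61, 65],
}
current = {"clicks_per_minute": 11, "heart_rate": 80}

baseline = build_baseline(history)
unusual = flag_unusual(deviations(baseline, current))
print(unusual)
```

A real system would still have to map an unusual signal to a mood, which is the harder inference; this sketch only shows how a departure from a personal norm might be detected.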

Studies suggest that many people already unwittingly reveal a lot about their mood and behavioral characteristics online. A team of researchers at Microsoft Research in Redmond, Wash., used Twitter to predict major depression in users with 70 percent accuracy by analyzing the time, content and frequency of their tweets. In another study, three researchers from that team predicted with 71 percent accuracy whether a woman would develop postpartum depression based on her online behavior before she delivered her child.

Apple said its tech would combine behavioral indicators — such as the rate of clicking, likes and comments; the order in which users open applications; and the date, time and location of their online interaction — with physical indicators gathered from wearable technology that can measure heart rate, blood pressure, adrenaline levels, perspiration rate, body temperature; voice level, pattern and stress; and movement and facial expression.

Are You Creeped Out Yet?

Digital marketers say that all of this data collection and analysis is designed to improve the experience for the consumer by offering more of what they want, or at least what companies think they want.

“[Consumers] are searching for emotional connections to companies, not just transactional relationships,” said Michael Lazerow, the chief marketing officer of the Salesforce ExactTarget Marketing Cloud, which helps companies streamline their digital marketing strategy. He added: “The value of targeted marketing isn’t just from the marketing message itself, it’s from the intelligence and optimization [data] with each interaction that builds a one-to-one relationship with each customer to increase brand loyalty and drive sales.”

Salesforce, for example, can help companies offer personalized product recommendations for users who visit the company website, or daily deals based on a user’s travel history and hotel preferences, according to its promotional videos. Or they can text a coupon to a customer’s phone when they come within range of a company store.

Privacy advocates worry that consumers aren’t getting a fair deal for their data. “The idea that any of this actually matches people’s expectations is just wrong,” said Lee Tien, a senior staff attorney at the Electronic Frontier Foundation, a nonprofit group that advocates for individual rights online. “It’s not a social contract. … It’s working for the commercial arena, but it’s leading to more and more collection of data, and there’s a lot of potential for mischief.”

He added: “There’s a butterfly effect that we don’t see. And one that the business world, for the most part, has no interest in helping us see.”

Tien points to security breaches, like the recent hack of Target’s customer data, as an indication that not all companies are responsible with the data they collect. The lack of transparency into what’s being collected, and by whom, he says, makes it more difficult to hold companies or individuals responsible for what they do with our data — or even to know how much data is out there and whether it’s accurate.

Some companies have already seen a backlash when the curtain is drawn back on their data collection. A few years ago, Target began analyzing purchase histories to determine when women were pregnant, and then sending them ads for baby products. Then, one of those fliers went to the home of a high-school student whose father didn’t know she was pregnant. Target apologized, and later scaled back its advertising to make the fliers look less conspicuous.

The Federal Trade Commission has brought several cases against companies for violating laws protecting consumer privacy, which include prohibitions on targeting children online.

But government regulators have struggled to catch up to the fast pace at which data analysis is evolving. Last month, Jessica Rich, director of the FTC’s Bureau of Consumer Protection, called for updated privacy and data security legislation to help “level the playing field” for consumers.

“We do have so much information that we could absolutely get creepy,” Eden said, adding that Target’s flier was clearly a “miss.” “You have to be responsible with that information. … It’s a delicate dance. For most of our customers, for our clients, we’re hired to drive results. So it’s a barometer that we have. We have to be good stewards of their brands. But at the same time, if they invest a dollar in digital, they’re looking to get five dollars back.”