“Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media.

Today’s column is written by Michael Mallazzo, director of marketing at Narrativ.

The data that powers the bulk of programmatic ad spend can correctly identify whether a user is male or female only about 50% of the time, according to an impeccably thorough report by Nico Neumann at Melbourne Business School.

In the eternal quest to figure out which “half of my ad budget is wasted,” we may want to start here. Neumann’s team estimates erroneous data costs advertisers $7 billion annually.

In 2014, Oracle paid roughly $400 million for BlueKai, a platform that pegs me as a married homeowner with two children who is interested in subcompact cars, rap and hip-hop, hunting and golf. I’m single, rent a Brooklyn apartment, proudly blast Springsteen and have never owned a car. And I hate golf.

After a cruelly ironic registration process that forced me to fork over my personal data to access my personal data, Acxiom’s abouthedata.com fared slightly better. It correctly identified me as male and provided some correct generalities, such as the killer insights that I’ve purchased apparel and food.

Abouthedata.com says that Acxiom helps companies “use data in responsible, ethical ways to create personalized experiences,” a refreshing tagline for these times. But personalization can only be as effective as the data that powers it.

So how did we come to accept the validity of flawed data?

Venture capitalists poured money into third-party data startups, which seemed to suggest the data was legit. Then large companies allocated massive budgets to programmatic advertising based on this data, which would also seem to suggest it was effective. And then big marketing clouds went shopping, further suggesting that the obscure periphery of our internet history data is worth hundreds of millions of dollars.

But what if large sectors of the data industry grew without their theses being fundamentally validated?

Of course, there is one company that understands the superfluousness of all this data: Amazon. To Amazon, you are what you buy. Nothing more, nothing less. Cambridge Analytica-type data is fundamentally meaningless to Amazon because it is less powerful than the first-party data it can provide to advertisers about shoppers. As Amazon prepares to eclipse $10 billion in ad revenue this year, the correlation between Amazon’s data and what we buy is why Sorrell lost the most sleep over its market entry.

In a macro sense, the totality of data mining fundamentally can’t match the simple power of contextual targeting that powers Google and Amazon. And Amazon’s targeting is probably more powerful in the long term. At the end of the day, people telling an advertiser what they are searching for is still a lot more accurate than the best artificially intelligent guess.

The irony here is that the current apparent ineptitude of data providers actually protects the privacy of internet users. Digitally savvy consumers overwhelm ad tech algorithms with so many data points that they become difficult, if not impossible, for data brokers to distill in any meaningful way. This “unknown” audience becomes less valuable to advertisers and enjoys a superior internet experience with their privacy intact.

This is the hallmark of a broken market. Advertising is effectively supposed to be a tax that we all pay to enjoy free services that have no business being free. But the externality of digital advertising is being disproportionately picked up by the subset of individuals whom firms can merely caricature. This should inspire deeper soul searching in the industry.

Follow Narrativ (@hellonarrativ) and AdExchanger (@adexchanger) on Twitter.