Mountains of cash keep pouring into the titans of big data despite the world's inability to do much of value with their software.

To wit, in its last fiscal year, Cloudera pulled in over $260m, with analysts projecting over $330m for 2017. Hortonworks, for its part, yanked down $184m, up from $121m in 2015. Outside the open-source elite, Splunk keeps chugging along, driving roughly $1bn in its last fiscal year, up from $668m the year before.

Despite this impressive revenue growth across the board, however, each of these companies is haemorrhaging cash as it feverishly spends on marketing and sales. But why? Why do these big data darlings need to invest so much money telling their stories even as enterprises have been hooked on big data's potential to dramatically overhaul yesterday's businesses?

The answer may come down to one tweet from a Gartner analyst. The spoiler? Virtually no one has been successful with their big data projects. They're spending lots of money but having little success.

The future is data, but we're stuck in the past

And yet... hope springs eternal in big data land. Even The Economist, normally so prim in its prognostications, put data on its 6 May cover and declared it the heart of a fast-rising "new economy". "Data is to this century what oil was to the last one: a driver of growth and change. Flows of data have created new infrastructure, new businesses, new monopolies, new politics and – crucially – new economics."

This follows on the heels of other breathless pronouncements about so-called "big data" and its impact on the world. For example, in 2011 McKinsey & Co. declared big data "the next frontier for innovation, competition, and productivity," citing hundreds of billions of dollars in value that the effective harnessing of data could deliver. A few years later, GE looked to big data to drive "changes as profound as industrialization... in the late 1700s".

Intriguingly, the vast majority of software used to manage this ever-swelling pool of data is completely free. As Cloudera co-founder Mike Olson correctly pointed out: "No dominant platform-level software infrastructure has emerged in the last ten years in closed-source, proprietary form." Instead, open-source software like Apache Hadoop, Apache Spark, MongoDB, Apache Kafka, and more has arisen to manage the flood of data.

Companies like Cloudera and Hortonworks subsequently arose to help mainstream enterprises put this otherwise complex software to work. As noted above, it's been a lucrative gig, with each company raising hundreds of millions of dollars and, in turn, generating hundreds of millions of dollars in revenue. What none of them has managed, however, is profit, and that's cause for concern.

Going broke on big data riches

Take a spin through the income statements of Hortonworks and Cloudera and you'll find both companies bleeding cash (though Cloudera's losses have narrowed over the last two years). Even bullish financial analysts like Morgan Stanley's Sanjit Singh believe Cloudera is "still a few years away" from profitability despite leadership status in the industry.

Other vendors like Alteryx and Splunk eschew an open-source software approach, but have found it just as hard to get anywhere near profitability. This is a bit perplexing for, as financial analyst Nehan Choksi opines: "The differentiation of Splunk's product is... [higher] (i.e. easier to sell)" than Cloudera's by a significant margin, largely thanks to its clear proprietary packaging. Cloudera and Hortonworks have to sell against the possibility of customers using Hadoop, Storm, etc. for free. Splunk and Alteryx...? Not so much.

And yet, the profits elude them, and seemingly get farther away with each passing quarter.

Take Splunk. The company notched more than 358 deals in its last quarter worth more than $100,000 each. By chief executive Doug Merritt's reckoning, Splunk's total addressable market in big data is "absolutely enormous and our biggest issue is how do we cover all the interest and opportunity and generate that opportunity out there." Yet the company's net income last year was negative $355m. (Sure, the company reports it will be profitable on a non-GAAP basis, but that's basically like saying it's profitable in an imaginary world of its own vanity accounting methods.)

On the open-source side, Hortonworks also touted its success on its earnings call, declaring that it counts 25 per cent of the Global 500 enterprises among its customers and that it nearly doubled its $1m deals in the quarter to 12 from seven a year earlier. Indeed, similar to Splunk, Hortonworks declared that it's seeing "the market accelerating in many respects and on its adoption of Hadoop generally as well as then embracing the streaming data into their overall data architecture strategy," making for a bright, amazing, and still unprofitable future. The revenue keeps getting bigger but so do the losses.

As for why – well, it's worth looking into how often these big investments in big data actually reap big rewards.

Chasing the big data dream

According to Gartner, the answer is "rarely". As analyst Merv Adrian has called out: "Only 15 per cent of surveyed businesses report deploying big data projects to production." As for last year, that number was just a scratch behind at 14 per cent.

In other words, the money keeps pouring into the big data companies even as their customers generally struggle to figure out how to turn those investments into meaningful outcomes. These big data vendors then have to spend mountains of cash to convince would-be customers that this time it's different, that this time their investment will return "actionable insights" – that elusive dream of data scientists everywhere.

And yet...

One farmer who bought into the hype put it this way: "Everybody is still trying to figure out where the value in data is." If you think a farmer doesn't exactly represent the savvy CIOs of the world with that statement, think again: nearly every CIO survey highlights a lack of clear value delivered by big data projects. CIOs know that they're supposed to be rolling in big data riches, but they're still forced to swim in the shallow waters of sparse data.

Nor do we really need to ask CIOs through surveys: the labour data reveals a distinct lack of big data adoption. As Greg Ip has written, the belief that data-driven AI/machine learning will make businesses more efficient by replacing workers with robots is "baffling and misguided". "Baffling because it's starkly at odds with the evidence, and misguided because it completely misses the problem: robots aren't destroying enough jobs," Ip says.

Indeed, IDG Research nails it when it finds that "abundant data by itself solves nothing."

If anything, it exacerbates enterprise problems, drowning CIOs who are both ill-prepared to understand the data and saddled with corporate cultures that aren't suited to harness it anyway. As for the data, IDG continues: "Its unstructured nature, sheer volume, and variety exceed human capacity and traditional tools to organize it efficiently and at a cost which supports return on investment requirements."

This isn't to say that big data is a big sham. It's not. However, enterprises are years away from getting full value from their data assets. Throwing cash at the problem isn't helping matters either. Companies need to scale back their ambitions to invest in projects that are more evolutionary than revolutionary in nature, looking to tweak rather than overhaul existing operational practices.

In short, the best big data strategy may be to go small. ®