
The Apache Hadoop ecosystem received a bit of a shakeup this week, with IBM's decision to eventually phase out its own BigInsights service in favor of standardizing on Hortonworks' Data Platform.


Here are the key details from their joint announcement:

Hortonworks will resell the IBM Data Science Experience with HDP, a leading Hadoop distribution, and adopt it as its strategic data science platform, giving developers a fast on-ramp to data science capabilities including machine learning, advanced analytics and statistics.

Also, Hortonworks and IBM will create new solution bundles that integrate HDP with IBM Big SQL, IBM's SQL engine for Hadoop, giving Hortonworks' legions of clients and users a familiar method of managing their data.

IBM is adopting HDP for its Hadoop distribution and will fully integrate it with Data Science Experience and Machine Learning. As a result, this solution will combine for users the rich data security, governance and operations functionality provided by HDP, and the advanced analytics and management of the Data Science Experience.

IBM will migrate existing IBM BigInsights users to HDP.

"It's no surprise that IBM is standardizing on HDP, as both distributions are based on the Open Data Platform Initiative standard and BigSQL was the only significant point of differentiation for IBM," says Constellation Research VP and principal analyst Doug Henschen.

"This is a significant opportunity for Hortonworks as IBM has hundreds of customers on BigInsights, either on-premises or in the cloud," he adds. "If Hortonworks picks up even half these customers as IBM starts offering HDP for on-premises deployments and as the basis of its cloud offering, Hortonworks stands to gain significant share in the big data platforms market."

Beyond gaining IBM's existing customers, Hortonworks will benefit from Big Blue's vastly larger sales force and longstanding relationships in large enterprise accounts.

Notably, BigSQL and DSX won't be part of Hortonworks' HDP, which is open source. Rather, they'll be optional bundles that provide important new capabilities for HDP. "BigSQL gives Hortonworks an ANSI-SQL compliant SQL-on-Hadoop option akin to Cloudera's Impala offering, though it also handles relational sources and object stores, while DSX is comparable to Cloudera's recently-released Data Science Workbench," Henschen says.

"In the fast-paced open source big data market, Hortonworks and IBM have found each other, complementing their database and data science platforms," says Constellation Research VP and principal analyst Holger Mueller. "This is likely a win for joint customers, but for customers on other Hadoop distributions it's a point of concern: Do they move the data to HDP or stay where they are?"

One might also wonder how Microsoft will react to the deal, given that its own HDInsight service for Azure is based on Hortonworks.

Still, this week's announcement is a natural progression for Hortonworks and IBM, given their existing partnership and status as founding members of ODPi. Mueller also points to another key partnership between IBM and Hortonworks this week, regarding support for the latter's DataFlow streaming engine on IBM Power Systems.

"Streaming data is a performance-critical use case, and it's likely streaming vendors, in this case Hortonworks HDF, can reap performance and TCO advantages working closer with hardware vendors like IBM," he says.