The term “Big Data” has rapidly transitioned from buzzword to reality. Not only giants like Facebook and Yahoo, but even small companies have started adopting this technology to try to predict the future of their business, its demands and its needs.

With Big Data Coming to Reality – Now What?

Decision making used to be a “rear-view mirror” activity, namely Business Intelligence: looking at past events and responding accordingly. But with increasing demand and the ability to analyze vast amounts of data in real time, decision making has become a forward-looking activity with the help of data scientists. Business executives can now see what is going on with inventory, sales orders and sensor information in real time. Systems and operations personnel can use big data analytics to sift through terabytes of log files and other machine data, looking for the root cause of a given problem.
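To make the log-analysis case concrete, here is a minimal sketch of the idea at toy scale: count error lines per component to see where a problem is concentrated. The log format (`<timestamp> <LEVEL> <component> <message>`), the function name and the sample lines are all assumptions made up for illustration; a real big data platform would distribute this kind of counting across many machines.

```python
from collections import Counter
from typing import Iterable


def count_errors_by_component(lines: Iterable[str]) -> Counter:
    """Count ERROR lines per component.

    Assumes a hypothetical log format:
    '<timestamp> <LEVEL> <component> <message>'
    """
    counts: Counter = Counter()
    for line in lines:
        # Split into at most four fields; the message keeps its spaces.
        parts = line.split(maxsplit=3)
        if len(parts) >= 3 and parts[1] == "ERROR":
            counts[parts[2]] += 1
    return counts


# Made-up sample data, for illustration only.
sample_logs = [
    "2024-01-01T10:00:00 INFO web request served",
    "2024-01-01T10:00:01 ERROR db connection timed out",
    "2024-01-01T10:00:02 ERROR db connection timed out",
    "2024-01-01T10:00:03 ERROR cache key missing",
]

# The component with the most errors is a first hint at the root cause.
top_suspect = count_errors_by_component(sample_logs).most_common(1)
print(top_suspect)
```

The same count-per-key pattern is what scales out on a cluster: each machine counts its own share of the logs, and the partial counts are then added together.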

How to Build a Big Data Environment?

An infrastructure that is linearly scalable and yet easy to administer is pivotal for a Big Data platform.

The primary question in building a big data environment is “Where?” Most organizations are weighing the pros and cons of the two choices, On-Premise vs. Cloud Service. One understandable dilemma is that, if the choice is the cloud, the data leaves the premises.

#1. On-Premise:

This is one of the most sought-after options for many organizations, mainly because of the sensitivity around data leaving the premises. Some of the challenges with this choice are:

Initial capital investment to set up the infrastructure without fully knowing the scale

Integrating the Big Data Infrastructure with the existing backend infrastructure

Finding skilled big data professionals to set up the infrastructure from scratch

Cost of administering the infrastructure and keeping it available

#2. Cloud Service:

Given the uncertainty around scale and value, a cloud service has been a wise choice for many organizations. Amazon’s Elastic MapReduce (EMR) and Microsoft’s Azure HDInsight pioneered hosting big data infrastructure in the cloud. However, a cloud service comes with the trade-off of having the data leave the premises. Many organizations are sensitive about customer data leaving the premises because of repeated cyber-attacks and privacy concerns. Still, the journey towards big data often involves prototypes and proofs of concept, and the elasticity of a cloud solution comes in really handy in such cases.

Apart from the “where” of hosting big data, the “what” of the infrastructure is equally critical: is it just storage? Organizations moving towards big data are often confronted with high data velocity, a variety of data (structured and unstructured) and massive data volume. Some of the infrastructure challenges include: