This blog presents a high-level view of the career opportunities in the Big Data domain and the basic skills they require. Some of the designations and their responsibilities are listed here.

Role – Data Scientist

A big data scientist needs to be familiar with some of the following languages: Python, R, Java, Ruby, Clojure, Matlab, Pig or SQL.

They need to have an understanding of Hadoop, Hive and/or MapReduce.
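To make the MapReduce idea concrete, here is a minimal sketch of the classic word count written in plain Python. It does not use Hadoop at all; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are made up for illustration and only mimic the three stages that a real MapReduce job goes through.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big jobs", "data jobs"]
    print(word_count(sample))  # {'big': 2, 'data': 2, 'jobs': 2}
```

On a real cluster the map and reduce steps run in parallel on many nodes and the framework handles the shuffle; the sketch only shows the shape of the computation.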

In addition, they need to be familiar with disciplines such as:

Natural Language Processing: the interactions between computers and human language.

Machine learning: using computers to develop and improve algorithms that learn from data.

Conceptual modelling: to be able to share and articulate models.

Statistical analysis: to understand and work around possible limitations in models.

Predictive modelling: most big data problems are about being able to predict future outcomes.



Role – Big Data Engineer / Big Data Developer / Big Data Architect

Step-by-step approach for a software engineer who is an expert in Java / C / C++ => Hadoop (APIs, MapReduce coding, ecosystem and administration) => Hive / Pig / Impala / ML => Oozie plus monitoring.

Architect, design and develop Big Data based software from scratch, or upgrade and maintain existing software.

Step-by-step approach for a software engineer who is an expert in Oracle / PL/SQL / MS SQL / Teradata / data warehousing => Hadoop (APIs, MapReduce coding, ecosystem and administration) => Hive / Pig / Impala / ML => Oozie plus monitoring tools.

Architect, design and develop Big Data based data warehouses.

Role – Big Data DBA

Design and development of data models.

Hadoop ecosystem installation and configuration

DR / cluster to cluster – database backup and recovery.

Database connectivity and security.

Performance monitoring and configuration-based tuning.

Disk space management.

Software patches and upgrades for Unix as well as Hadoop

Role – Big Data Admin/Hadoop Administrator

Good Linux and shell scripting background.

Good knowledge of Hadoop Ecosystem and technologies.

Understanding of Hadoop design principles and the factors that affect distributed system performance, including hardware and network considerations.

Experience in providing infrastructure recommendations, capacity planning and developing utilities to monitor the cluster better.

Experience managing large clusters with huge volumes of data.

Experience with cluster maintenance tasks such as adding and removing nodes, cluster monitoring and troubleshooting, and managing and reviewing Hadoop log files.

Experience installing and implementing security for Hadoop clusters.
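As a rough illustration of the capacity planning mentioned above, the sketch below estimates how much raw disk and how many data nodes a cluster might need. The function names and the 25% free-space headroom are assumptions chosen for the example, not figures from any official sizing guide; the replication factor of 3 matches the usual HDFS default but should be checked against the actual cluster configuration.

```python
import math


def raw_storage_tb(data_tb, replication=3, headroom=0.25):
    """Estimate raw disk (TB) needed to hold data_tb of logical data.

    replication: number of copies kept of every block (assumed 3 here).
    headroom:    fraction of raw disk kept free for temporary files and
                 growth (an assumed value, tune it for your workload).
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    return data_tb * replication / (1 - headroom)


def nodes_needed(data_tb, disk_per_node_tb, replication=3, headroom=0.25):
    """Number of data nodes needed, rounded up to a whole node."""
    total = raw_storage_tb(data_tb, replication, headroom)
    return math.ceil(total / disk_per_node_tb)


if __name__ == "__main__":
    # 100 TB of data, 48 TB of usable disk per node (hypothetical hardware).
    print(raw_storage_tb(100))    # 400.0
    print(nodes_needed(100, 48))  # 9
```

A real recommendation would also account for compression, data growth rate, and CPU, memory and network needs; the sketch covers the storage arithmetic only.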

Role – Big Data / Hadoop Operations / Production Support

Good Linux and shell scripting background.

Good knowledge of Hadoop Ecosystem and technologies.

Cluster maintenance

Job Management / Job failures / Investigation / Restart

Autosys / Oozie integration

Data analysis – Data recovery

Cluster to Cluster data movement

Escalations

Operations management.
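The job failure and restart duties listed above often come down to rerunning a failed job a limited number of times while keeping a record of each failure for later investigation. Below is a minimal sketch of that pattern. `run_with_restarts` is a hypothetical helper, and `job` stands for any Python callable; in practice a scheduler such as Oozie or Autosys would normally own this logic.

```python
import time


def run_with_restarts(job, max_attempts=3, delay_seconds=0.0):
    """Run job(), restarting it on failure up to max_attempts times.

    Returns (result, failures), where failures is a list of messages
    describing each failed attempt. Raises RuntimeError if every
    attempt fails.
    """
    failures = []
    for attempt in range(1, max_attempts + 1):
        try:
            return job(), failures
        except Exception as exc:
            # Keep the failure for investigation instead of discarding it.
            failures.append(f"attempt {attempt}: {exc}")
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(
        f"job failed after {max_attempts} attempts: " + "; ".join(failures)
    )
```

For example, a job that fails twice and then succeeds returns its result together with two recorded failure messages, which can then be escalated or attached to an incident ticket.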