Introduction

This learning path provides a short but intensive introduction to data analysis. The path is divided into three parts. In part 1, we learn general programming practices (software design, version control) and tools (Python, SQL, Unix, and Git). In part 2, we learn R and focus more narrowly on data analysis, studying statistical techniques, machine learning, and presentation of findings. Part 3 offers a choice of elective topics: visualization, social network analysis, and big data (Hadoop and MapReduce). Choose any or all of them to enrich your understanding and skills.

The course consists of free online lectures, homework assignments, quizzes and projects, and will take around 350-400 hours to complete. There will also be a capstone project at the end that you can use to demonstrate your skills to potential employers or for a school application. This is an intensive path with a lot of material to learn, but at the end, you will know all the tools and techniques you need to start analyzing data: how to manipulate data, apply statistical and machine learning techniques, and visualize results. You'll be prepared to begin a career in data analysis.

Why learn this?

Data analysis is both a fascinating topic in itself and a tool that lets you make powerful inferences and understand the world around you. The techniques you will learn will help you accurately characterize data using models and then make inferences and decisions. If you enjoy applying math and analytical thinking to practical problems, this course is for you.

Demand for data analysts has also risen sharply, so learning analytics can be a real advantage from a career perspective. Beyond work, being able to find trends in large datasets helps you make sound, evidence-based decisions.

What will you learn?

This free data analysis course teaches some of the most important techniques and tools needed to manipulate and analyze large datasets and summarize conclusions. These include:

exploratory and predictive statistics

basic computer programming in Python

more advanced computer program design

an introduction to algorithms

R for statistical analysis

practical machine learning techniques

Unix and Git

data visualization best practices

Finally, there are three optional elective tracks: Visualizing Data, Analyzing Social Networks, and Big Data: Hadoop and MapReduce.

What won’t you learn?

Analytics and data science are enormous and burgeoning fields with many areas of study, and we will not have time to cover them all. In the interest of getting you analyzing real datasets as quickly as possible, the emphasis in this path is on practical applications as opposed to theory. Furthermore, while significant math is required in this path, we will not be covering the theoretical basis for statistics or machine learning. Also, the focus will be on analysis and manipulation of data rather than setup and storage. Some advanced statistics topics such as time series and Bayesian methods will not be studied in this path. Finally, some specific topics such as natural language processing and computer vision will not be covered.

Who is this data analysis course for?

This course is intended for people with little to no background in data analysis and computer programming. An introductory statistics class and an introductory programming class will both come in handy, but neither is necessary. Basic familiarity with calculus and general computer competency are assumed.



If you want to learn more, check out our free mini course on learning Python for Data Science or our Intro to Business Analytics mini course.