This class is designed to introduce you to data science using the Python programming language. Through this intense, daylong program, you will become familiar with the tools and concepts needed to manipulate, visualize, and explore datasets and extract valuable insights from them.

Expert Instructor

This course is taught by Ted Petrou, an expert in exploring data with Python through Pandas. He is the author of Pandas Cookbook, a thorough step-by-step guide to accomplishing a variety of data analysis tasks with Pandas.

Is this course for you?

This course will provide you with a solid foundation and resources so that you can smoothly transition into the world of data science. No prior programming experience is needed, as a thorough introduction to Python will be given in a pre-course assignment.

After we set up our programming environment, you will get a high-level introduction to the most common data analysis tasks. By the end of the course, you will have a complete guide to successfully performing a data analysis.

Small Class Size

This is a small class with at most 10 participants, so you will be able to ask specific questions and get help quickly.

Discount with Next Class

Take the Applied Machine Learning with Scikit-Learn class the next day, August 26, and get $25 off with code double25.

When

Saturday, August 25, 2018: 9 a.m. - 5 p.m.

Syllabus

Before the Course:

Students with no Python programming experience will need to set aside 20-30 hours to set up the programming environment and complete a thorough overview of the fundamentals of Python.

Part 1: Data Science Environment Setup

One of the most frustrating aspects of getting started with programming or data science is setting up an environment on your personal machine that reliably allows you to begin a data analysis without going insane. We will install the excellent Anaconda Python distribution and develop programs in both Jupyter Notebooks and the PyCharm IDE.

Part 2: Introduction to Pandas - Selecting Subsets of Data

Pandas is an extremely popular data analysis tool with the DataFrame being the primary container of data. One of the most fundamental and critical tasks in pandas is selecting subsets of data, which we will do in a variety of ways.
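As a rough illustration of what selecting subsets of data can look like, here is a minimal sketch. The DataFrame, its column names, and its values are invented for this example and are not course material; the selection tools shown (the indexing operator, .loc, .iloc, and boolean indexing) are standard pandas features.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration
df = pd.DataFrame(
    {
        "city": ["Houston", "Austin", "Dallas", "Houston"],
        "year": [2016, 2016, 2017, 2017],
        "sales": [250, 180, 310, 275],
    },
    index=["a", "b", "c", "d"],
)

# Select a single column as a Series
sales = df["sales"]

# Select several columns as a DataFrame
city_sales = df[["city", "sales"]]

# Select rows and columns by label with .loc
# (label slices include both endpoints)
by_label = df.loc["b":"c", ["city", "year"]]

# Select rows and columns by integer position with .iloc
# (the end of a position slice is excluded)
by_position = df.iloc[0:2, 0:2]

# Select only the rows that satisfy a condition (boolean indexing)
houston = df[df["city"] == "Houston"]
```

Each approach returns a new Series or DataFrame containing only the requested rows and columns, which is why the choice between label-based and position-based selection matters early on.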

Part 3: Split-Apply-Combine

Insights within datasets are often hidden amongst different groupings. The split-apply-combine paradigm is the fundamental procedure to explore differences amongst groups within datasets.
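A minimal sketch of split-apply-combine with pandas groupby follows. The department and salary data are hypothetical and chosen only to keep the arithmetic easy to follow; this is one common way to express the idea, not necessarily how the course presents it.

```python
import pandas as pd

# Hypothetical toy data, invented purely for illustration
df = pd.DataFrame(
    {
        "department": ["Sales", "Sales", "Engineering", "Engineering", "Support"],
        "salary": [50000, 60000, 90000, 110000, 45000],
    }
)

# Split the rows by department, apply the mean to each group's salaries,
# and combine the per-group results into a single Series
mean_salary = df.groupby("department")["salary"].mean()

# Apply several aggregations at once; the result has one row per group
summary = df.groupby("department")["salary"].agg(["min", "max", "count"])
```

Here mean_salary is indexed by department, so a difference between groups (for example, Engineering versus Sales) can be read off directly.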

Part 4: Exploratory Data Analysis

Exploratory data analysis (EDA) is a process for gaining understanding and intuition about datasets. Visualizations are the foundation of EDA and communicate the discoveries made along the way. The Seaborn library works directly with tidy data to create elegant visualizations with little effort.

Post-Course Plan

You will be given a detailed plan on how to both master Pandas and develop a routine for doing data analysis. You will also have access to the instructor in a private Slack chatroom.

Instructor

Ted Petrou is the author of Pandas Cookbook and the founder of Dunder Data and the Houston Data Science Meetup group. He worked as a data scientist at Schlumberger, where he spent the vast majority of his time exploring data. Ted received his master's degree in statistics from Rice University and used his analytical skills to play poker professionally and teach math before becoming a data scientist.