CS109 Data Science

Predicting Hubway Stations Status by Lauren Alexander, Gabriel Goulet-Langlois, Joshua Wolff

This course is about learning from data to gain useful predictions and insights. It introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data management to access big data quickly and reliably; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization, stories, and interpretable summaries.

We will be using Python for all programming assignments and projects.

The course is also listed as AC209, STAT121, and E-109.

Instructors

Pavlos Protopapas, SEAS

Kevin Rader, Statistics

Mark Glickman, Statistics

Chris Tanner, SEAS

Joe Blitzstein, Statistics

Hanspeter Pfister, Computer Science

Verena Kaynig-Fittkau, Computer Science

Material from CS 109 as taught from 2013 to the present