Introduction to Data Analysis

François Briatte and Ivaylo Petev

Sciences Po, Euro-American Campus

Spring 2013

This course is an introduction to analyzing data with the R software. It features some mathematics and statistics as well as some statistical computing and data visualization. You will need a laptop with an Internet connection to follow the class.

To get started, download the entire course; it's not a large download. To see what the course material is made of, view it on GitHub first.

Part 1: Introduction to Statistical Computing. The course starts with one month dedicated to setting up R and learning its basic functionality. All course logistics will be discussed during these weeks.

Readings: Kabacoff, ch. 1, and Teetor, ch. 1 and 3.

Readings: Teetor, ch. 2 and 5.

Readings: Kabacoff, ch. 5, and Teetor, ch. 8.

Readings: Kabacoff, ch. 4, and Teetor, ch. 4 and 6-7.

Part 2: Introduction to Statistical Analysis. The course continues with a showcase of statistical techniques, from finding clusters of related data to modelling relationships between several variables.

Readings: Kabacoff, ch. 14, and Teetor, ch. 13.4, 13.6 and 13.9.

Readings: Kabacoff, ch. 6, Teetor, ch. 10, and Urdan, ch. 4-6. See also Urdan, ch. 1-3, if you have forgotten everything about statistics.

Readings: Kabacoff, ch. 7, Teetor, ch. 9, and Urdan, ch. 7, 9 and 14.

Readings: Kabacoff, ch. 8 and 11, Teetor, ch. 11, and Urdan, ch. 8, 10 and 13. Skim ANOVA to focus on OLS (simple and multiple linear regression).

Part 3: Introduction to Data Visualization. The course ends with a focus on the graphical dimension of quantitative data. We will also try to invite guest speakers to talk about their professional use of data.

Reading: Teetor, ch. 14. Focus on detrending and read ARIMA only if you plan to earn millions as a financial analyst.

Special guest: Joël Gombin on reproducible science.

Special guest: Alexandre Léchenet on data-driven journalism.

Special guest: Samuel Goëta will speak in the Distinguished Lecture Series on open data and open government.

We're done!

Thanks to Baptiste Coulmont, Joël Gombin and Timothée Poisot for very valuable advice and comments, to GitHub for hosting and to users at StackExchange for coding assistance.

Special thanks to the Sciences Po Reims staff and students for their unfailing support.

Inspired by Christopher Adolph, Dave Armstrong, Christopher Gandrud, Andrew Gelman, Rebecca Nugent, Gaston Sanchez, Cosma Shalizi, David Sparks and Hadley Wickham.

This course has its own GitHub repository; fork at will. This HTML version was compiled from source on Thursday January 09, 2014.