Mining Social Media

Oh yay! You found the web version of the book Mining Social Media! Here you can learn how to mine, process, and analyze data from the social web in meaningful ways with the Python programming language.

On the pages of this website you’ll learn how to use technical tools to collect and analyze data from social media sources to build compelling, data-driven stories.

Learn how to:

Write Python scripts and use APIs to gather data from the social web

Download data archives and dig through them for insights

Inspect HTML downloaded from websites for useful content

Format, aggregate, sort, and filter your collected data using Google Sheets

Create data visualizations to illustrate your discoveries

Perform advanced data analysis using Python, Jupyter Notebooks, and the pandas library

Apply what you’ve learned to research topics on your own

My publishers at No Starch and I really wanted to ensure that people of all socioeconomic backgrounds have access to this book, so this is a free version of it. But if you do have the means and would like to support us, you can buy a copy of the ebook or the physical book at No Starch Press.

Buy the book!

About the author

Lam Thuy Vo is a senior reporter at BuzzFeed News where she digs into data to examine how systems and policies affect individuals. She’s explored how excessive ‘quality-of-life’ complaints led to the over-policing of minorities, how badly constructed algorithms helped spread hate speech and warp our understanding of politics, and how changes in immigration enforcement drove immigrants into the arms of fraudulent lawyers. Previously, she led teams and reported for The Wall Street Journal, Al Jazeera America and NPR’s Planet Money and told economic and political stories across the U.S. and throughout Asia.

She’s spent the past few years building expertise in an increasingly relevant practice: investigating the social web. She’s brought her research to scholars in closed and public circles at institutions like Harvard, MIT, Columbia and Data & Society and has written a Python book for No Starch Press about her empirical approach to finding stories in data from the Internet.

Outside of her job, she has also worked as an educator for a decade, developing newsroom-wide training programs for institutions like Al Jazeera America and The Wall Street Journal, workshops for journalists around the world and semester-long courses for the Craig Newmark CUNY Graduate School of Journalism. Please find some tip sheets, courses and workshop materials she’s created here.

She’s committed to helping her industry become more diverse. She co-administers a Slack community for journalists of color and co-created a resource guide for journalists of color looking for career growth, salary data, demographic breakdowns of newsrooms and training opportunities.

In her spare time, you can find her making data visualizations with social data, producing playful videos or rock climbing.