There are two common ways to get data from a given website: web scraping and APIs. Many websites have official APIs that return structured data in just a few lines of code, while web scraping is extremely versatile and can access the most up-to-date information. I won't go into the merits and drawbacks of each method here; instead, I present an example of each.

In Part 1 I present a (slightly) modified version of a web scraper and parser used by fivethirtyeight.com to retrieve weather data from Wunderground.com.

In Part 2 I go over using an API to get weather and location data from NOAA.

The original source for the Wunderground code can be found here.

I've modified the code in a couple of ways to make things work better for me, but its structure is largely the same.

Wunderground Scraper

To use this scraper for your own purposes, you will first need to find the names of all the weather stations from which you would like to gather data. Since I live in North Dakota, I would like to access a few stations from around the state. For each city I'm interested in, I can punch its name into the Wunderground search bar and write down the station name. I chose seven stations from around the state to start.

| City Name | Station Name |
| --- | --- |
| Bismarck | KBIS |
| Fargo | KFAR |
| Grand Forks | KGFK |
| Devils Lake | KDVL |
| Jamestown | KJMS |
| Minot | KMOT |
| Williston | KISN |
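
Before scraping anything, it can help to put the station list into a form the code can loop over. The snippet below is a minimal sketch, not part of the original scraper: the `STATIONS` dictionary copies the table above, while the `station_days` helper and the date range are hypothetical names and values chosen only for illustration. It enumerates one (station, day) pair per page a scraper would need to fetch.

```python
from datetime import date, timedelta

# City-to-station mapping copied from the table above.
STATIONS = {
    "Bismarck": "KBIS",
    "Fargo": "KFAR",
    "Grand Forks": "KGFK",
    "Devils Lake": "KDVL",
    "Jamestown": "KJMS",
    "Minot": "KMOT",
    "Williston": "KISN",
}


def station_days(stations, start, end):
    """Yield (station_code, day) for every station and each day from start to end, inclusive."""
    n_days = (end - start).days + 1
    for code in stations.values():
        for offset in range(n_days):
            yield code, start + timedelta(days=offset)


# Hypothetical three-day range, used only to show the shape of the output.
pages_to_fetch = list(station_days(STATIONS, date(2015, 1, 1), date(2015, 1, 3)))

for code, day in pages_to_fetch[:3]:
    print(code, day.isoformat())
```

Keeping the stations in one dictionary means adding or dropping a city is a one-line change, and the same list of pairs can drive either the scraper in Part 1 or the API calls in Part 2.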