Is it getting hot in here, or is it just me? You’ve no doubt seen the barrage of coverage discussing climate change over the last century. But how do you separate the hype from the facts?

Let’s go straight to the source.

Today we’re going to use a dataset sourced directly from NOAA (National Oceanic and Atmospheric Administration) and plot that data in Python using Matplotlib.

NOAA has a wide variety of datasets tracking all kinds of things, some of them reaching back hundreds of years. For this tutorial, we’re going to use a dataset tracking global land and ocean temperature anomalies each June. The dataset reaches all the way back to 1880, so that gives us a lot to work with.

Let’s see what the data has to say.

Access the Dataset

The first thing you need to do is access the proper dataset from NOAA. They have a whole data gallery you can browse, but for this example we’ll be using the Climate at a Glance dataset. It comes in CSV format and shows a date, the mean temperature, and the variation of that mean from the average temperature between 1901-2000. That way we can see how much higher or lower the mean temperature is from the “average” temperature across the last century.
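To make the idea of an anomaly concrete, here is a minimal sketch of the arithmetic: average the values inside the base period, then subtract that average from every value. The numbers and the helper name anomalies are made up for illustration; they are not NOAA values, and NOAA's own processing may differ in its details.

```python
# Made-up June mean temperatures, for illustration only (NOT NOAA values).
sample_means = {1901: 15.0, 1950: 15.5, 2000: 16.0, 2015: 16.5}


def anomalies(means, base_start=1901, base_end=2000):
    """Return each year's difference from the base-period average."""
    # Average only the years that fall inside the base period.
    base_values = [v for year, v in means.items() if base_start <= year <= base_end]
    baseline = sum(base_values) / len(base_values)
    # A positive result is warmer than the base period, a negative one cooler.
    return {year: value - baseline for year, value in means.items()}


result = anomalies(sample_means)
for year, diff in sorted(result.items()):
    print(year, diff)
```

In this toy example the base-period average is 15.5, so 2015 comes out 1.0 above it even though 2015 is not part of the base period itself.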

Set up Dev Environment

Once you’ve downloaded the dataset, we need to get our development environment set up.

You’ll need to have Python 3.6 installed on your machine for this tutorial. We’ll begin by setting up a virtual environment to manage the dependencies. This uses the Python package virtualenv. If you don’t have it installed, you can install it by entering pip install virtualenv at your command line.

$ mkdir climate_data

$ cd climate_data

$ virtualenv -p /usr/local/bin/python3 climate

$ source climate/bin/activate

This creates and activates a Python environment within the climate_data folder, so you can install your dependencies and not deal with conflicts from other Python versions or libraries. Your shell prompt should look something like this now:

(climate) Als-MacBook-Pro:climate_data alnelson$
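If your prompt looks different and you want to confirm that the environment is really active, a quick check from Python can help. This is a small sketch, not part of the tutorial's script; the function name in_virtual_environment is made up here, and the check relies on how virtualenv and venv set the interpreter's prefix attributes.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
```

With the climate environment activated, the interpreter path should point inside the climate folder.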

The next thing we need to do is install matplotlib, which will help us plot the data on a graph.

$ pip install matplotlib

Once that’s done, we’re ready to move on to the coding part of this tutorial.

Import the Data

Create a Python file called climate.py and open it in your favorite text editor. Then import the necessary libraries:

import matplotlib as mpl
import numpy as np
import matplotlib.pyplot as plt

Note: if you’re on macOS, you may see an error when you try to import pyplot. This is a known issue with matplotlib and virtualenv. Luckily, there is a workaround. If you’re getting errors, enter these lines right after the numpy import, so that the backend is selected before pyplot is imported:

mpl.use('TkAgg')
import matplotlib.pyplot as plt

The next thing we need to do is load in the CSV data file. We do that using NumPy’s genfromtxt function, like so:

data = np.genfromtxt('global_data.csv', delimiter=',', dtype=None, skip_header=5, names=('date', 'value', 'anomaly'))

We’ll break this down a little at a time. First, you need to enter the name and path of your data file. In my case, I have the dataset in the same directory as my Python file. Make sure you’re pointing to the correct file location. Next, I specify the delimiter which is ‘,’ since it’s a CSV. dtype=None tells the interpreter to automatically assign data types based on the data that appears in the columns. skip_header tells it to skip the first 5 rows, because if you look at the dataset in a text editor, you’ll see that the first 5 rows are description. Finally, we tell NumPy what each column is called and save it to a variable called data.
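To see what those arguments do without touching the real file, here is a small sketch that runs the same call on an in-memory stand-in. The header text and the numbers are made-up placeholders, not NOAA data, and the real file's description rows and date format may look different.

```python
import io

import numpy as np

# Made-up stand-in for the downloaded CSV: five description rows, then data.
# The values below are placeholders, NOT NOAA measurements.
sample_csv = io.StringIO(
    "Title row\n"
    "Units row\n"
    "Base period row\n"
    "Missing value row\n"
    "Year,Value,Anomaly\n"
    "1880,60.0,-0.5\n"
    "1881,60.2,-0.3\n"
    "1882,60.7,0.2\n"
)

# Same arguments as in the tutorial, applied to the in-memory sample.
sample = np.genfromtxt(sample_csv, delimiter=',', dtype=None, skip_header=5,
                       names=('date', 'value', 'anomaly'))

print(sample.dtype.names)   # the column names we supplied
print(sample['date'])       # one column, selected by name
print(sample['anomaly'])
```

The result is a NumPy structured array, which is why the plotting code can select a column by name, as in data['date'].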

Graph the Data

Now that we’ve got our data loaded in, we need to set up matplotlib to receive it.

plt.title("Global Land and Ocean Temperature Anomalies, June")
plt.xlabel('year')
plt.ylabel('degrees F +/- from average')
plt.bar(data['date'], data['value'], color="blue")
plt.show()

You should now have a plot that looks like this:

What does this mean for our Earth? Well, you have the data. I’ll let you be the judge.

Next Steps

If you want to try some other interesting experiments, you could look up the weather on the day you were born, or on major election days. Perhaps you could combine the weather with polling data to see if there is any correlation. NOAA has lots of datasets for you to play with, so go and check them out.
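A smaller experiment is to restyle the chart from this tutorial so that bars above zero and bars below zero get different colors. The sketch below is one possible way to do that; the helper names bar_colors and plot_colored are made up for illustration, and it assumes the data array loaded earlier in climate.py.

```python
def bar_colors(values, warm="red", cool="blue"):
    """Pick one color per bar: warm above zero, cool at or below zero."""
    return [warm if v > 0 else cool for v in values]


def plot_colored(dates, values):
    """Draw the same bar chart as the tutorial, colored by sign."""
    # Imported inside the function so bar_colors can be used without matplotlib.
    import matplotlib.pyplot as plt

    plt.title("Global Land and Ocean Temperature Anomalies, June")
    plt.xlabel('year')
    plt.ylabel('degrees F +/- from average')
    # plt.bar accepts a list of colors, one entry per bar.
    plt.bar(dates, values, color=bar_colors(values))
    plt.show()


# Usage, with the array loaded earlier in climate.py:
# plot_colored(data['date'], data['value'])
```

Passing a list to the color argument keeps the rest of the plotting code unchanged, so it is easy to switch back to a single color.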