Udacity CEO Vish Makhijani delivering the Intersect 2018 Keynote, with map visualization on screen

For last week’s Intersect 2018 conference, I created a map visualization that was shown during the keynote speech from Vish Makhijani, Udacity’s CEO. The visualization highlighted the explosive growth of graduates from our Nanodegree programs over the last year. Below you can see the animation playing behind Vish for about 20 seconds during his speech.

Animations like these are extremely powerful for telling stories that evolve across time.

At the start of 2017, there were fewer than 6,000 Nanodegree program graduates. By the end of 2017, there were over 25,000!

The visualization I created highlights not just the overall number of alumni, but also the geographical distribution of our students. For example, Artificial Intelligence was massively popular in China, Japan, and India, while the School of Business (including our Data Analyst and Digital Marketing Nanodegree programs) exploded in Brazil. It was a fun project for me and I wanted to share how I built it, so you can use these techniques and approaches in your work.

My general strategy was to generate an image for each day in the dataset, then convert all those images into a video. To do this, I used Python and a few packages: Pandas for loading and manipulating the data, Cartopy for drawing the map, and Matplotlib for plotting the data. After generating all the images, I used ffmpeg to combine the individual frames into the video you see above.

Below, I’ll detail how I created this video so you can do the same in your projects. First off, I’ll import the necessary packages.

from datetime import datetime, timedelta

import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import pandas as pd

The dataset I used consisted of four columns:

School of [AI, Business, Developers, Autonomous Systems]

Graduation date

Longitude

Latitude

I loaded the CSV file with Pandas and named each column appropriately. The graduation dates are read in as strings, so I converted them to datetime objects with pd.to_datetime.

df = pd.read_csv('graduations_data.csv',
                 names=['School', 'Grad Date', 'Long', 'Lat'])

# convert the graduation date column to datetime objects
df['Grad Date'] = pd.to_datetime(df['Grad Date'])
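As a minimal sketch of the loading step, the snippet below reads a few made-up rows from an in-memory string instead of graduations_data.csv. The sample rows and the sample_csv name are assumptions for illustration; only the column names and the pd.read_csv and pd.to_datetime calls come from the code above.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for graduations_data.csv (no header row)
sample_csv = io.StringIO(
    "School of AI,2017-03-01,116.4,39.9\n"
    "School of Business,2017-06-15,-46.6,-23.5\n"
    "School of AI,2017-11-20,139.7,35.7\n"
)

df = pd.read_csv(sample_csv, names=['School', 'Grad Date', 'Long', 'Lat'])

# The dates are read as plain strings until they are converted
df['Grad Date'] = pd.to_datetime(df['Grad Date'])

print(df.dtypes)
```

After the conversion, the Grad Date column can be compared directly against datetime objects, which is what the date filtering later in the post relies on.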

Once I had the data, I created a figure and an axis with Matplotlib using a figure size of 19.2 x 10.8 inches. I wanted the final video to have a resolution of 1920 x 1080, so I chose this size and saved the figures at 100 DPI (dots per inch). I created the axis using the Mercator projection with Cartopy.

fig = plt.figure(figsize=(19.2, 10.8))
ax = plt.axes(projection=ccrs.Mercator(central_longitude=0,
                                       min_latitude=-65,
                                       max_latitude=70))
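The link between figure size, DPI, and the final pixel dimensions is plain multiplication. A quick sketch of that arithmetic, using the numbers from the text (the variable names are only for illustration):

```python
# Saved image size in pixels = figure size in inches * DPI used when saving
width_in, height_in = 19.2, 10.8
dpi = 100

# round() guards against floating-point noise in the multiplication
width_px = round(width_in * dpi)
height_px = round(height_in * dpi)

print(f"{width_px} x {height_px}")
```

Doubling the DPI to 200 with the same figure size would give 3840 x 2160 frames instead.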

A projection is how you transfer the surface of the Earth (a sphere) to the flat plane of a map. There’s no perfect way to accomplish this, so I stuck with the Mercator projection, which is often seen in U.S. schools and apps like Google Maps. From there I added the background map image, taken from NASA’s Blue Marble set, and defined the part of the map to display with ax.set_extent. Also note that I used a low-resolution image here, which speeds up developing the code. When I was ready to generate the frames for the video, I used the full-resolution background image.

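# Assumes a 'BM' entry in the images.json of Cartopy's background image
# directory (for example, the one named by CARTOPY_USER_BACKGROUNDS)
ax.background_img(name='BM', resolution='low')
ax.set_extent([-170, 179, -65, 70], crs=ccrs.PlateCarree())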

The data uses latitudes and longitudes, but Cartopy needs to know how to convert those to the appropriate locations on the map. For that, I used the ccrs.PlateCarree() projection. I’ll use the same projection later in the code to place data points and text. At that point, I had the basic map and could start adding to it.

World map displayed with Cartopy
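To give a feel for what the Mercator projection does to latitudes, here is a small sketch of the textbook spherical formula, y = R ln(tan(π/4 + φ/2)). This is an illustration only, not what Cartopy computes internally (Cartopy works with an ellipsoidal globe by default), and the mercator_y helper is a name made up for this example.

```python
import math


def mercator_y(lat_deg, radius=1.0):
    """Spherical Mercator: y = R * ln(tan(pi/4 + lat/2))."""
    lat = math.radians(lat_deg)
    return radius * math.log(math.tan(math.pi / 4 + lat / 2))


# Equal steps in latitude take up more and more vertical space near the poles
for lat in (0, 30, 60, 70, 85):
    print(f"{lat:>2} deg -> y = {mercator_y(lat):.3f}")
```

Because y grows without bound toward the poles, a Mercator map has to be cut off at some latitude, which is one reason to pass limits like min_latitude and max_latitude as in the code above.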

Once I had the map, I started adding data to it. I wanted each image to show the map background as well as dots indicating the location of students who had graduated by some date. I set a date with datetime (December 31st, 2017 for example) and got all the graduations on or before that date.

date = datetime(2017, 12, 31)
grads = df[df['Grad Date'] <= date]

With the graduation data, I looped through each school and placed a dot at each location where students had graduated. I also wanted the size of the dot to indicate the number of graduates at that location. To do that, I used .groupby(['Long', 'Lat']).count(), which groups by longitude-latitude pairs and counts up the number of graduates for each pair.

# Define colors for each school
colors = {'AI': '#02b3e4',
          'Aut Sys': '#f95c3c',
          'Business': '#ff5483',
          'Developers': '#ecc81a'}

for school, school_data in grads.groupby('School'):

    grad_counts = school_data.groupby(['Long', 'Lat']).count()

    # Get lists for longitudes and latitudes of graduates
    index = list(grad_counts.index)
    longs = [each[0] for each in index]
    lats = [each[1] for each in index]
    sizes = grad_counts['School']*10

    # The school names are like 'School of AI', remove 'School of '
    school_name = ' '.join(school.split()[2:])

    ax.scatter(longs, lats, s=sizes,
               color=colors[school_name], alpha=0.8,
               transform=ccrs.PlateCarree())
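To see what the grouping step produces, here is a small sketch on made-up data with no map involved. The three rows are hypothetical; the groupby, index unpacking, and name-stripping lines mirror the code above.

```python
import pandas as pd

# Hypothetical graduates: two share one location, one is elsewhere
school_data = pd.DataFrame({
    'School': ['School of AI'] * 3,
    'Grad Date': pd.to_datetime(['2017-03-01', '2017-05-10', '2017-11-20']),
    'Long': [116.4, 116.4, 139.7],
    'Lat': [39.9, 39.9, 35.7],
})

# One row per unique (Long, Lat) pair; each remaining column holds the row count
grad_counts = school_data.groupby(['Long', 'Lat']).count()

# The index is a MultiIndex of (longitude, latitude) tuples
longs = [pair[0] for pair in grad_counts.index]
lats = [pair[1] for pair in grad_counts.index]
sizes = grad_counts['School'] * 10

# Strip the leading 'School of ' to get the key used in the colors dict
school_name = ' '.join('School of AI'.split()[2:])

print(longs, lats, list(sizes), school_name)
```

The location with two graduates ends up with a marker size of 20 and the other with 10, so the dot area scales with the number of graduates.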

World map with graduations data

With the data plotted, it was time to add text for the date, the number of graduates, and labels matching colors and schools.

fontname = 'Open Sans'
fontsize = 28

# Positions for the date and grad counter
date_x = -53
date_y = -50
date_spacing = 65

# Positions for the school labels
name_x = -70
name_y = -60
name_spacing = {'Developers': 0,
                'AI': 55,
                'Business': 1.9*55,
                'Aut Sys': 3*55}

# Date text
ax.text(date_x, date_y,
        f"{date.strftime('%b %d, %Y')}",
        color='white',
        fontname=fontname, fontsize=fontsize*1.3,
        transform=ccrs.PlateCarree())

# Total grad counts
ax.text(date_x + date_spacing, date_y,
        "GRADUATES", color='white',
        fontname=fontname, fontsize=fontsize,
        transform=ccrs.PlateCarree())
ax.text(date_x + date_spacing*1.7, date_y,
        f"{grads.groupby(['Long', 'Lat']).count()['School'].sum()}",
        color='white', ha='left',
        fontname=fontname, fontsize=fontsize*1.3,
        transform=ccrs.PlateCarree())

for school_name in ['Developers', 'AI', 'Business', 'Aut Sys']:
    ax.text(name_x + name_spacing[school_name],
            name_y,
            school_name.upper(), ha='center',
            fontname=fontname, fontsize=fontsize*1.1,
            color=colors[school_name],
            transform=ccrs.PlateCarree())

# Expands image to fill the figure and cut off margins
fig.tight_layout(pad=-0.5)

Background map, graduate data, date, and labels

With the code to generate a single image working, I put it all in one function and looped through all the dates I was interested in, calling that function for each date.

def make_grads_map(date, data, ax=None, resolution='low'):

    if ax is None:
        fig = plt.figure(figsize=(19.2, 10.8))
        ax = plt.axes(projection=ccrs.Mercator(min_latitude=-65,
                                               max_latitude=70))

    ax.background_img(name='BM', resolution=resolution)
    ax.set_extent([-170, 179, -65, 70], crs=ccrs.PlateCarree())

    grads = data[data['Grad Date'] < date]

    ### rest of the code

    # Return the axis so the caller can keep reusing it
    return ax


start_date = datetime(2017, 1, 1)
end_date = datetime(2018, 3, 15)

fig = plt.figure(figsize=(19.2, 10.8))
ax = plt.axes(projection=ccrs.Mercator(min_latitude=-65,
                                       max_latitude=70))

# Generate an image for each day between start_date and end_date
# (the frames/ directory must already exist)
for ii, days in enumerate(range((end_date - start_date).days)):
    date = start_date + timedelta(days)
    ax = make_grads_map(date, df, ax=ax, resolution='full')
    fig.tight_layout(pad=-0.5)
    # Note: recent Matplotlib releases no longer accept frameon in savefig;
    # drop that argument there
    fig.savefig(f"frames/frame_{ii:04d}.png", dpi=100,
                frameon=False, facecolor='black')
    ax.clear()

When I first wrote this, I was creating a new figure and axis on each pass through the loop. However, this led to the memory usage exploding. Instead, I cleared the axis at the end of each pass with ax.clear(), then replotted the map and data on the same axis.
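The date arithmetic in the loop can be checked on its own, without drawing anything. This sketch only builds the frame filenames and dates the loop above would step through; the frames list is a name introduced for illustration.

```python
from datetime import datetime, timedelta

start_date = datetime(2017, 1, 1)
end_date = datetime(2018, 3, 15)

n_days = (end_date - start_date).days

# (frame filename, date) pairs, one per day; end_date itself is excluded
frames = [(f"frames/frame_{ii:04d}.png", start_date + timedelta(days=ii))
          for ii in range(n_days)]

print(n_days)
print(frames[0])
print(frames[-1])
```

The zero-padded index keeps the files in order when sorted by name and gives ffmpeg a predictable numbering pattern to read.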

After generating all the images, I treated them like frames in a movie. There are a lot of different software options for converting images into a movie; I chose ffmpeg. I set the frame rate such that the video came out about 20 seconds long and at 1920x1080 resolution.

ffmpeg -framerate 21 -i frames/frame_%4d.png -c:v h264 -r 30 -s 1920x1080 ./grads.mp4
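The frame rate follows from the number of frames and the target length. A rough sketch of that arithmetic, assuming the 438 daily frames produced by the date range used in the loop above:

```python
n_frames = 438   # days from 2017-01-01 up to, but not including, 2018-03-15
framerate = 21   # value passed to ffmpeg's -framerate option

# Video length is the number of frames divided by the input frame rate
duration_s = n_frames / framerate
print(f"{duration_s:.1f} seconds")

# Going the other way: the input frame rate needed for a target duration
target_s = 20
needed_rate = n_frames / target_s
print(f"{needed_rate:.1f} frames per second")
```

In the ffmpeg command, -framerate sets how fast the input images are read, while -r 30 sets the output frame rate, so ffmpeg repeats frames as needed and the length of the video stays the same.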

I hadn’t worked with map data before this so it was a great learning experience for me. I actually started this project using Basemap instead of Cartopy to generate the map. However, Basemap isn’t being supported anymore and doesn’t work with the newest versions of Matplotlib. I had to change a few lines of code in the Basemap package to generate the video. After making the video, I ended up rewriting the map code with Cartopy for this blog post.