The Data Behind a Season Without Snow Days

How behavioral Google search data and scientific weather forecast data play into the federal government’s snow day decision-making.

I took this photo on January 23, 2016, amid “Snowzilla,” which dumped ~22 inches of snow on the Lincoln Memorial.

As spring arrives in Washington, D.C., we close the chapter on an abnormal winter in the District: For the first time in five years, the federal government did not declare a single full-day closure due to snow.

D.C. had its chances at a federal snow day. On March 14, the city braced for Snowstorm Stella, which threatened six or more inches. Typically, such threats lead the Office of Personnel Management (OPM), the all-powerful federal snow day authority, to declare a closure. But OPM didn’t bite and issued only a three-hour delay with the option to work remotely. Even my office (General Assembly’s D.C. campus, where I’m a full-time Data Science Immersive instructor) declared a snow day, as did many others. (OPM was probably right this time: Only a light dusting followed the exaggerated predictions.)

Because of the excitement surrounding the impending snow, it seemed like a given that OPM would issue a closure, as it has in the past. That made me wonder whether there was any relationship between the emotional buildup that follows a snowy forecast and the chances of OPM actually declaring a snow day.

I wondered: Are OPM’s opaque closure decisions truly, entirely based on the forecast, or are they also susceptible to human impulse?

To answer this question, I turned to data science.

The Data

To get at the root of OPM’s closure decisions, I decided I needed three datasets: the history of all federal government snow day closures, historic weather forecast data, and a heuristic measure of human excitement leading up to a snow day.

Federal Government Closure Data

Gathering historic government data on federal closures proved simple. The Office of Personnel Management maintains a Snow & Dismissal Procedures archive. I automated the data collection and open-sourced the script so you can do the same.

Historic Weather Forecast Data

Collecting historic weather forecast data proved surprisingly challenging. Note that I’m not obtaining historic weather data, but historic weather forecast data. This difference is critical: I’m basing all my analysis on the weather data that was available to OPM at the time of its decision (which I’m assuming is 11 p.m. the night preceding a snow day).
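To make the 11 p.m. assumption concrete, here is a minimal sketch of how the forecast cutoff for a candidate snow day could be computed. The function name forecast_cutoff and the constant DECISION_HOUR are hypothetical names chosen for illustration, not taken from my open-sourced script, and the times are treated as local D.C. time without any time zone handling.

```python
from datetime import date, datetime, time, timedelta

# Assumption stated in the article: OPM decides at 11 p.m. the night before.
DECISION_HOUR = 23


def forecast_cutoff(snow_day: date) -> datetime:
    """Return the assumed decision time for a candidate snow day.

    Only forecasts issued at or before this moment count as
    "available to OPM" in the analysis.
    """
    night_before = snow_day - timedelta(days=1)
    return datetime.combine(night_before, time(hour=DECISION_HOUR))


# Example input only: the Snowzilla date from the photo caption.
cutoff = forecast_cutoff(date(2016, 1, 23))
print(cutoff.isoformat())  # 2016-01-22T23:00:00
```

Any forecast timestamped after the cutoff would be dropped before modeling, so the analysis never leaks information OPM could not have seen.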

While gathering historic weather data is quite straightforward, historic weather forecast data is a bit harder to find. I suspect this is because weather reporters would rather publish what happened than what they thought would happen. (I should audit weather reporters for my next project.)

I pulled the forecasts from an API called Dark Sky and open-sourced my script for doing so.

Behavioral Snow Day Excitement Data

To collect behavioral data on the excitement surrounding a possible snow day, I turned to Google Trends. Google Trends enables anyone to see the popularity of a search term within a given geography and time range. This is an excellent way to encapsulate snow day excitement: Rumors of snow encourage searches for news coverage and forecasts.

Google lets you collect the last five years of Trends data at the weekly level. Values are reported on a 0-to-100 scale based on relative search volume, where a term scores 100 in the week of its peak search frequency within the user-provided timeframe.
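The 0-to-100 scale is easier to grasp with a small worked example. The sketch below rescales a series so that its peak week becomes 100. The weekly numbers are made up, and scale_to_peak is a hypothetical helper rather than anything Google provides. Google does not publish raw search counts, so this only approximates the final scaling step, not Google's full normalization.

```python
def scale_to_peak(volumes):
    """Rescale a series so its largest value becomes 100.

    Values are rounded to whole numbers, mirroring the integer
    scores that Trends reports.
    """
    peak = max(volumes)
    if peak == 0:
        # No searches at all in the window: every week scores 0.
        return [0 for _ in volumes]
    return [round(100 * v / peak) for v in volumes]


# Made-up weekly search volumes for a single term.
weekly_volume = [30, 60, 150, 75]
print(scale_to_peak(weekly_volume))  # [20, 40, 100, 50]
```

Because every score is relative to the peak inside the chosen window, the same week can receive a different score if the timeframe changes, which is worth remembering when comparing downloads.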

The Analysis

Before examining how each factor plays into a snow day, I first visualized how OPM’s closure decisions have varied over time.