We’re getting to that time of year again when an omniscient rodent informs us whether or not we can put away our winter gear early and get ready for fun in the sun. I’m, of course, talking about Groundhog Day, the North American tradition of turning to the wisdom of groundhogs for the weather forecast.

The most famous of the groundhogs is Pennsylvania’s Punxsutawney Phil, whose ancestors (all named Phil) got into meteorology back in 1887 after failing to find success in the burrowing business.

Every February 2, thousands gather in Punxsutawney to watch as Phil is wrenched away from important groundhog business so that he may predict the weather for us. When he emerges from his burrow, if he sees his shadow, it means we’ll have six more weeks of winter. If he doesn’t see his shadow, we’ll have an early spring.

Phil has been at it for more than a century, so he must be doing a hell of a job. But every so often it’s good to take a step back and check our assumptions. How accurate are Phil’s weather predictions?

Fortunately, there are methods for answering such a question. To do so, I will draw on fields such as data science, statistics and machine learning. I do my best to explain concepts at a high level here, but my explanations are by no means complete or precise.

(If you’re interested in the nitty-gritty, the data and code can be found at https://github.com/docmarionum1/Groundhog-Day.)