I have not missed an episode of American Ninja Warrior (ANW) since its fourth season in 2012. And while ANW is certainly more of a reality TV show than a traditional sport, it feels like too little emphasis is placed on its few sport-like qualities. This is perhaps most evident in the practically nonexistent use of data and statistics—two cornerstones of most sports. Questions like “What’s Elet Hall’s average time on the Quintuple Steps?” or “How long does Joe Moravsky usually rest between obstacles?” go largely unanswered.

In an attempt to answer some of these questions, I’ve collected data on every televised run during season 7. In a series of posts, I intend to discuss the process, my findings, a few opportunities for visualizations, and a new statistic—similar to the NBA’s Player Efficiency Rating—that will aid in the creation of ANW power rankings.

Data Collection

Obviously, there are a number of challenges involved in trying to take accurate splits from a TV broadcast—a few of which are outlined below.

1. Misrepresentation of elapsed time: You’ll notice that in many runs there are “jumps” in time. For example, after showing a competitor’s family for three seconds, the clock will have jumped 40 seconds.
2. Camera panning: The key points for split-taking are when a competitor starts and completes an obstacle. Unfortunately, these moments are not always available to the viewer.
3. Not every competitor is shown: A competitor’s run can be completely shown, partially shown, or not shown at all.

The workaround for (1) is to only use the official clock (rather than a stopwatch, for instance). This ensures that the split estimates sum to the official run time. Consider, for example, Kevin Bull’s Qualifying run from season 7:

| Transition | Transition Time (sec) | Obstacle | Obstacle Time (sec) |
|---|---|---|---|
| 0 | 0 | Quintuple Steps | 2.73 |
| 1 | 7.09 | Mini Silk Slider | 2.41 |
| 2 | 1.07 | Tilting Table | 1.29 |
| 3 | 1.14 | Spin Cycle | 8.93 |
| 4 | 3.50 | Hourglass Drop | 18.91 |
| 5 | 5.56 | Warped Wall | 3.77 |

We see that the sum of Kevin’s individual splits (56.4 seconds) matches his official clocking. And while this may seem rather time-consuming at first glance, it doesn’t require much more effort than simply watching the episode. This is especially true since I’ve written two simple Python scripts to do most of the heavy lifting: split_taking.py (for courses with no time limit) and split_taking_mt.py (for courses with a time limit, e.g., 2:30.00).
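To make the official-clock idea concrete, here is a minimal sketch of how splits could be derived from clock readings. It is not the contents of split_taking.py or split_taking_mt.py: the function name `splits_from_clock`, its parameters, and the convention of noting one clock reading per checkpoint are assumptions made for illustration. For timed courses, the sketch further assumes the broadcast clock counts down from the limit.

```python
from typing import List, Optional


def splits_from_clock(readings: List[float],
                      time_limit: Optional[float] = None) -> List[float]:
    """Turn official clock readings (in seconds) into per-segment splits.

    readings: the clock value noted each time the competitor finishes an
        obstacle or a transition, in broadcast order.
    time_limit: if given, the clock is assumed to count DOWN from this
        value, so each reading is first converted to elapsed time.
    """
    if time_limit is not None:
        elapsed = [time_limit - r for r in readings]
    else:
        elapsed = list(readings)

    splits = []
    previous = 0.0
    for value in elapsed:
        if value < previous:
            raise ValueError("clock readings must not go backwards")
        # Each split is the difference between consecutive clock readings.
        splits.append(round(value - previous, 2))
        previous = value
    return splits


# Cumulative readings reconstructed by summing Kevin Bull's splits from the
# table above (they are not separately observed values).
kevin_readings = [2.73, 9.82, 12.23, 13.30, 14.59, 15.73,
                  24.66, 28.16, 47.07, 52.63, 56.40]
kevin_splits = splits_from_clock(kevin_readings)
total = round(sum(kevin_splits), 2)
print(kevin_splits, total)
```

Because every split is a difference between consecutive readings of the same clock, the splits telescope to the final reading. A time jump in the broadcast therefore lands in whichever segment it interrupts rather than making the total disagree with the official time.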

(2) and (3), on the other hand, are simply limitations of the method: we’re going to have incomplete data due to some runs not being shown and we’re going to have some inaccuracy introduced by inconvenient camera positions. That said, I still think this project provides an interesting look at the role that data and statistics could play in ANW.

The Database

As I watched each episode, I created CSV files consisting of rows like the following:

Kevin Bull,30,M,2.73,7.09,2.41,1.07,1.29,1.14,8.93,3.50,18.91,5.56,3.77,56.4,Completed
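A row like this could be read as follows. This is only a sketch: the field meanings (name, age, sex, then alternating obstacle and transition times, then total time and outcome) are inferred from the example row and the split table earlier in the post, and `parse_run_row` and the dictionary keys are hypothetical names.

```python
import csv
import io


def parse_run_row(row):
    """Split one CSV row into a dict, assuming the layout
    name, age, sex, obstacle/transition times..., total, outcome."""
    times = [float(value) for value in row[3:-2]]
    return {
        "name": row[0],
        "age": int(row[1]),               # assumed meaning of the 2nd field
        "sex": row[2],                    # assumed meaning of the 3rd field
        "obstacle_times": times[0::2],    # 1st, 3rd, 5th, ... time values
        "transition_times": times[1::2],  # the values in between
        "total": float(row[-2]),
        "outcome": row[-1],
    }


line = ("Kevin Bull,30,M,2.73,7.09,2.41,1.07,1.29,1.14,"
        "8.93,3.50,18.91,5.56,3.77,56.4,Completed")
run = parse_run_row(next(csv.reader(io.StringIO(line))))
print(run["obstacle_times"], run["transition_times"], run["total"])
```

Slicing from the fourth field to the third-from-last means the same function would accept rows with fewer time fields, which is presumably what a run that ends early would produce.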

These CSV files are then converted into a SQLite database (the structure of which is shown above), making the answers to many interesting questions a mere query away. For example, the fastest times on the Quintuple Steps from Venice Qualifying:

| Name | City | Category | Time (sec) |
|---|---|---|---|
| Kevin Bull | Venice | Qualifying | 2.73 |
| Brendon Ayanbadejo | Venice | Qualifying | 2.91 |
| David Campbell | Venice | Qualifying | 2.92 |
| Alan Connealy | Venice | Qualifying | 3.03 |
| Brian Kretsch | Venice | Qualifying | 3.86 |

    SELECT Ninja.name, Course.city, Course.category, ObstacleResult.time
    FROM ObstacleResult
    JOIN Ninja ON (ObstacleResult.ninja_id = Ninja.id)
    JOIN Obstacle ON (ObstacleResult.obstacle_id = Obstacle.id)
    JOIN Course ON (Obstacle.course_id = Course.id)
    WHERE (Obstacle.obstacle_name = 'Quintuple Steps'
           AND ObstacleResult.completed = 1
           AND Course.id = 6)
    ORDER BY ObstacleResult.time ASC

I’m sure this list isn’t particularly surprising to anyone who follows ANW but, as you can probably imagine, a database like this can answer many more questions (some of which will be discussed in subsequent posts).
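For readers who want to try the query, here is a minimal in-memory sketch. The table and column names come from the query itself, but the column types, the keys, and the absence of any other columns are assumptions, so the actual schema may well differ. The times are the five Quintuple Steps results listed above, inserted out of order so that the sort is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Ninja (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Course (id INTEGER PRIMARY KEY, city TEXT, category TEXT);
    CREATE TABLE Obstacle (
        id INTEGER PRIMARY KEY,
        course_id INTEGER REFERENCES Course(id),
        obstacle_name TEXT
    );
    CREATE TABLE ObstacleResult (
        ninja_id INTEGER REFERENCES Ninja(id),
        obstacle_id INTEGER REFERENCES Obstacle(id),
        time REAL,
        completed INTEGER
    );
""")

# Course id 6 matches the query above; the obstacle id 1 is arbitrary.
conn.execute("INSERT INTO Course VALUES (6, 'Venice', 'Qualifying')")
conn.execute("INSERT INTO Obstacle VALUES (1, 6, 'Quintuple Steps')")

# Quintuple Steps times from the results table above, deliberately unsorted.
results = [
    ("Alan Connealy", 3.03),
    ("Kevin Bull", 2.73),
    ("Brian Kretsch", 3.86),
    ("David Campbell", 2.92),
    ("Brendon Ayanbadejo", 2.91),
]
for ninja_id, (name, seconds) in enumerate(results, start=1):
    conn.execute("INSERT INTO Ninja VALUES (?, ?)", (ninja_id, name))
    conn.execute("INSERT INTO ObstacleResult VALUES (?, 1, ?, 1)",
                 (ninja_id, seconds))

rows = conn.execute("""
    SELECT Ninja.name, Course.city, Course.category, ObstacleResult.time
    FROM ObstacleResult
    JOIN Ninja ON ObstacleResult.ninja_id = Ninja.id
    JOIN Obstacle ON ObstacleResult.obstacle_id = Obstacle.id
    JOIN Course ON Obstacle.course_id = Course.id
    WHERE Obstacle.obstacle_name = 'Quintuple Steps'
      AND ObstacleResult.completed = 1
      AND Course.id = 6
    ORDER BY ObstacleResult.time ASC
""").fetchall()

for row in rows:
    print(row)
```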

Run Flow

The first visualization we’re going to discuss is called the “Run Flow.” It attempts to provide a means for both analyzing an individual run and comparing two different runs. Check it out below!

[Interactive Run Flow chart comparing two selected runs: Ninja 1 and Ninja 2]

As you can see, the default parameters compare the two fastest runners of the night: Kevin Bull and Alan Connealy. It was a fairly tight race until the third transition, at which point Bull pulled away for good. Another interesting comparison is between Alan Connealy and Nicholas Coolridge, whose finishing times were the closest of the night.

Up Next

In the next post in this series, I plan to take a closer look at the Qualifying and City Finals episodes: Which course and obstacle were the most difficult? Who had the most impressive run?

I’ve also considered making this data freely available through a website with a leaderboard (overall, men, and women) of the top 15 competitors and an individual profile page for each competitor containing career, course, and obstacle statistics. I’ll discuss this idea more thoroughly in future posts but, in the meantime, you can stop by the project’s GitHub repository.

As always, feel free to contact me with any comments or suggestions.