Shark Week. The best week of the entire year.

Why do we love it so? Is it the extensive description of blue sharks’ social patterns? The scientific explanation of the effect of water salinity on the health of tiger sharks? The lengthy summary of the whale sharks’ migration? Of course not.

We are drawn to Shark Week year after year for one reason only: our morbid fascination with shark attacks.

This very fascination led my colleagues and me to do our own pre-Shark Week homework. If Shark Week has taught us anything, it is to be anxious while hanging out in the water just off the beach—we all know that surfers and swimmers are the most common victims of shark attacks.

We wondered: are these nerves justified? And should we worry when taking part in other ocean activities?

In an attempt to determine whether certain ocean activities lead to deadlier shark attacks than others, we used SAS Viya to analyze data from sharkattackdata.com detailing the events that led up to more than 5,000 shark attacks worldwide.

Initial Exploration of Shark Attacks

Using the text analytics capabilities of SAS Visual Analytics 8.1 on SAS Viya, we first identified the most frequent words and phrases describing what the victim was doing at the time of the attack. With Viya’s default settings, the resulting visualization let us quickly see the most frequent activities: the larger the word, the higher the frequency of the term or phrase. As we had theorized, surfing and swimming were the two most common terms.
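For readers without access to Viya, the idea behind this step can be sketched in a few lines of Python. This is only a rough illustration of counting terms in activity descriptions, not what Visual Analytics does internally. The sample descriptions, the stop-word list, and the function name `term_frequencies` are all made up for the example and are not drawn from the sharkattackdata.com data.

```python
from collections import Counter
import re

# Hypothetical activity descriptions, invented for illustration only.
SAMPLE_ACTIVITIES = [
    "Surfing",
    "Swimming",
    "Surfing near a sandbar",
    "Spearfishing",
    "Swimming after dark",
    "Surfing",
]

# A tiny illustrative stop-word list (an assumption, not Viya's own list).
STOP_WORDS = {"a", "the", "near", "after", "at", "in", "on"}


def term_frequencies(descriptions):
    """Count how often each non-stop-word term appears across descriptions."""
    counts = Counter()
    for text in descriptions:
        # Lowercase the text and keep only alphabetic tokens.
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts


freqs = term_frequencies(SAMPLE_ACTIVITIES)

# In a frequency display, the terms with the largest counts are drawn largest.
for term, count in freqs.most_common(2):
    print(term, count)
```

In this toy sample, surfing and swimming come out on top only because the rows were written that way; the real ranking comes from the full dataset.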

While all shark attacks should be understood and taken seriously (except that one where a swimmer was bitten after trying to grab the shark for a selfie; yes, that happened), we needed to focus our work on identifying the activities that correlate most strongly with fatal attacks.

By subsetting the data to look only at fatal attacks, we see a somewhat different story. While surfing was the most common activity across all shark attacks, it accounts for a smaller proportion of the fatal ones. Other types of activities make up a larger share of the fatal attacks than they do of attacks overall.
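The comparison described above boils down to two proportions per activity: its share of all attacks and its share of fatal attacks. Here is a minimal Python sketch of that calculation, assuming each record has been reduced to an activity label and a fatal/non-fatal flag. The records and the function name `activity_shares` are hypothetical and exist only to show the arithmetic; they say nothing about the actual results.

```python
from collections import defaultdict

# Hypothetical (activity, was_fatal) records, invented for illustration only.
SAMPLE_ATTACKS = [
    ("surfing", False),
    ("surfing", False),
    ("surfing", True),
    ("swimming", True),
    ("swimming", False),
    ("diving", True),
]


def activity_shares(records):
    """Return {activity: (share_of_all_attacks, share_of_fatal_attacks)}."""
    total = defaultdict(int)
    fatal = defaultdict(int)
    for activity, was_fatal in records:
        total[activity] += 1
        if was_fatal:
            fatal[activity] += 1

    n_all = sum(total.values())
    n_fatal = sum(fatal.values())
    return {
        activity: (
            total[activity] / n_all,
            fatal[activity] / n_fatal if n_fatal else 0.0,
        )
        for activity in total
    }


shares = activity_shares(SAMPLE_ATTACKS)
for activity, (all_share, fatal_share) in shares.items():
    print(f"{activity}: {all_share:.0%} of all attacks, {fatal_share:.0%} of fatal attacks")
```

An activity whose second number is larger than its first is overrepresented among fatal attacks, which is the pattern we were looking for.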