Fishermen have no way of separating the fish they catch when they cast their nets at sea. Protected species and fish with no market value -- the hammerhead shark, for example -- end up being trapped and dying for no reason. In an attempt to minimize this incidental fishing, statisticians from the University of Geneva (UNIGE, Switzerland), Dalhousie University (Halifax, Canada) and the Australian National University (Canberra) have devised a new statistical method for predicting bycatches more accurately in the future. The technique, which is explained in full in the journal Annals of Applied Statistics, can also be applied to other research fields, including health economics, medicine and educational science.

When fishermen set out on their expeditions at sea, protected species are caught accidentally in their nets alongside the fish intended for sale. Biologists collect data on fish numbers and species conservation so they can study the volume of incidental catch and its impact on marine fauna. This data has a complex structure, known as "nested," because observations are grouped within layers of technical information, such as the number of expeditions or the type of boats used. The data also records the number of protected fish caught in the nets on each fishing trip. However, some species -- the hammerhead shark is one such case -- are usually not caught at all, so the records contain a great many zeros, which makes it difficult to build models that account for them for each species. "Until now, there has been no general statistical method that combines a nested data structure with a large quantity of zeros in the observations," explains Eva Cantoni, professor at the Research Center for Statistics at UNIGE's Geneva School of Economics and Management (GSEM). "So this gap needed to be filled, which we did by setting up a very general and flexible model, called the Random-Effects Hurdle Model."
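To make the two ingredients concrete, here is a minimal simulation sketch of data that is both nested and full of zeros. It is an illustration only, not the authors' model: the logistic hurdle, the zero-truncated Poisson counts, the Gaussian vessel-level effects, the single "depth" covariate and every numeric coefficient are assumptions chosen for the example.

```python
import math
import random


def sample_zero_truncated_poisson(rate, rng):
    """Draw a Poisson(rate) value conditioned on being at least 1 (rejection sampling)."""
    limit = math.exp(-rate)
    while True:
        # Knuth's multiplication method for one ordinary Poisson draw
        k, prod = 0, rng.random()
        while prod > limit:
            k += 1
            prod *= rng.random()
        if k > 0:  # reject zeros so the result is zero-truncated
            return k


def simulate_bycatch(n_vessels=20, trips_per_vessel=30, seed=0):
    """Return (vessel, trip, depth, count) records; all coefficients are made up."""
    rng = random.Random(seed)
    records = []
    for vessel in range(n_vessels):
        # Nesting: every trip of a vessel shares that vessel's random effects.
        u_zero = rng.gauss(0.0, 0.8)   # shifts the chance of any catch
        u_count = rng.gauss(0.0, 0.4)  # shifts the size of a positive catch
        for trip in range(trips_per_vessel):
            depth = rng.uniform(0.0, 1.0)  # hypothetical scaled hook depth
            # Hurdle part: is anything caught at all on this trip?
            p_catch = 1.0 / (1.0 + math.exp(-(-2.0 + 1.5 * depth + u_zero)))
            if rng.random() < p_catch:
                # Count part: how many, given that the hurdle was crossed
                rate = math.exp(0.2 + 0.8 * depth + u_count)
                count = sample_zero_truncated_poisson(rate, rng)
            else:
                count = 0
            records.append((vessel, trip, depth, count))
    return records


if __name__ == "__main__":
    data = simulate_bycatch()
    zero_share = sum(1 for r in data if r[3] == 0) / len(data)
    print(f"{len(data)} trips, share of zero catches: {zero_share:.2f}")
```

In this sketch most trips record a zero, and trips from the same vessel resemble one another more than trips from different vessels, which are the two features the researchers needed a single model to handle.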

The complexity of generality

The statisticians developed the new method with the ultimate goal of supporting managed fishing and reducing bycatch. "We had to take a range of dynamics into account," continues Cantoni. "The aim was not just to analyse the changes in the number of catches over time but also to study the different seasons and the weather, all the while factoring in the technical conditions: the depth of the nets, the seasons (as I've already mentioned), the type of hooks used, whether light sticks were used or not, and the kind of vessel." Based on this data, the researchers identified conditions that are easy to change (such as the depth of the hooks) and that would reduce the number of non-marketable fish caught.

The statisticians then created a new methodology that combines older models, each of which specialised in either nested structures or the handling of excess zeros. "The difficulty lay in bringing these two aspects together while ensuring the model was as general as possible so that it could adapt to many situations," says Joanna Mills Flemming, from the Department of Mathematics and Statistics at Dalhousie University. The more general a model is, the more complex it is to process. Modern simulation techniques were used to estimate the model's parameters (related, for example, to the depth of the hooks) and their variability. The authors also proved theorems that determine and quantify the margins of error for the model and its predictions.

Preventing incidental catches and supporting environmental policy

This modelling means it is now possible to estimate potential bycatches for a fishing expedition. "When fishermen give us their voyage data, we can predict the incidental catch for hammerhead sharks, for example, with more precision," states Cantoni. "The method can be used to back up environmental policies by prohibiting fishing at a certain depth at a particular time of year since it would involve too much bycatch," adds Alan Welsh from the Australian National University.
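As a rough illustration of what such a prediction involves, the sketch below computes the expected bycatch for one trip under a simple hurdle model. It is not the published method: the logistic hurdle, the zero-truncated Poisson count part, the function name `expected_bycatch`, the single depth covariate and all coefficients are hypothetical and were chosen only to make the arithmetic visible.

```python
import math


def expected_bycatch(depth, u_zero=0.0, u_count=0.0):
    """Expected catch for one trip under an assumed Poisson hurdle model.

    depth   -- hypothetical hook depth, scaled to lie between 0 and 1
    u_zero  -- assumed vessel-level random effect on the chance of any catch
    u_count -- assumed vessel-level random effect on the size of a catch
    All coefficients below are invented for illustration.
    """
    # Probability that the trip catches at least one animal (the hurdle)
    p_catch = 1.0 / (1.0 + math.exp(-(-2.0 + 1.5 * depth + u_zero)))
    # Poisson rate of the count part before truncation at zero
    rate = math.exp(0.2 + 0.8 * depth + u_count)
    # Mean of a zero-truncated Poisson: rate / (1 - exp(-rate))
    mean_if_positive = rate / (1.0 - math.exp(-rate))
    return p_catch * mean_if_positive


if __name__ == "__main__":
    for depth in (0.2, 0.5, 0.8):
        print(f"depth {depth:.1f}: expected bycatch {expected_bycatch(depth):.2f}")
```

Comparing the expected value across depths, seasons or vessel types is the kind of calculation that could inform a rule such as restricting fishing at a given depth. The direction and size of any real effect would have to come from a model fitted to actual voyage data.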

The model fills a statistical gap: previously, there was no general model capable of handling complex, nested data structures and a high number of zero observations at the same time. The new model does not just serve commercial fishing: it can also be used in other areas with complex data structures, including health economics, medicine and educational science.