A group of researchers has found an objective way to identify the most important members of a soccer team. Yes, there is a scientific reason why you love Lionel Messi. It all comes down to statistics.


It's relatively straightforward to calculate individual performances in team sports that can be easily divided into lots of discrete events, whether it's at-bats in baseball, offensive possessions in basketball, or downs in football. When each game is made up of dozens or even a hundred such moments, it's a lot easier to drill down and identify which players made an impact in various situations.

Soccer sits at the opposite end of this spectrum. With the exception of halftime, the games never stop, and goals are so rare (as anyone watching the current World Cup can attest) that it's pretty much impossible to accurately judge who played well based purely on who scored. So then, what's a budding sabermetrician supposed to do with the elegant game?


Three researchers at Northwestern think they've figured out a solution. The key was recognizing that the real objective in a soccer game isn't so much scoring goals as it is moving the ball away from your own goal and towards the opposing team's, thereby maximizing your team's scoring opportunities. As such, players who successfully maintain possession of the ball maximize their team's chances of success. (This is a lot like the sabermetric tenet that the most important thing a baseball team can do is not make outs, which is a big reason why on-base percentage enjoys its current exalted status.)

To chart this, the researchers created "flow networks", in which each player on the team is represented by a node, and all the attempted passes between each pair of players are charted, along with two additional nodes that represent shots that missed and shots that scored. This network reveals two essential statistics for each player: passing accuracy, or the fraction of passes that are successfully completed to a teammate, and shooting accuracy, or the fraction of shots that result in goals.
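To make the idea concrete, here is a minimal Python sketch of such a network. It is an illustration only, not the researchers' code: the event format, the player names, and the function names (`build_flow_network`, `passing_accuracy`, `shooting_accuracy`) are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical event log (invented for illustration, not data from the study).
# Each pass is (passer, receiver); a receiver of None means the pass
# was not completed to a teammate.
passes = [("A", "B"), ("B", "C"), ("A", None), ("C", "A"), ("B", None)]
# Each shot is (shooter, scored).
shots = [("C", True), ("C", False), ("A", False)]


def build_flow_network(passes, shots):
    """Return weighted edges: player -> player for completed passes,
    plus player -> "GOAL" or player -> "MISS" for shots."""
    edges = Counter()
    for passer, receiver in passes:
        if receiver is not None:
            edges[(passer, receiver)] += 1
    for shooter, scored in shots:
        edges[(shooter, "GOAL" if scored else "MISS")] += 1
    return edges


def passing_accuracy(passes):
    """Fraction of each player's attempted passes that reached a teammate."""
    attempted = Counter(passer for passer, _ in passes)
    completed = Counter(passer for passer, receiver in passes if receiver is not None)
    return {player: completed[player] / attempted[player] for player in attempted}


def shooting_accuracy(shots):
    """Fraction of each player's shots that resulted in a goal."""
    taken = Counter(shooter for shooter, _ in shots)
    scored = Counter(shooter for shooter, was_goal in shots if was_goal)
    return {player: scored[player] / taken[player] for player in taken}


network = build_flow_network(passes, shots)
pass_acc = passing_accuracy(passes)
shot_acc = shooting_accuracy(shots)
```

In this toy log, player A completes one of two passes and player C scores on one of two shots, so both accuracies come out to one half.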

Taken together, these two metrics chart how successful each player is at contributing to the team's scoring opportunities. When placed in the context of the flow network, they reveal the myriad paths from player to player that the ball can take, and how likely each path is to result in a shot on goal. This in turn provides a measure of each player's flow centrality, or the percentage of the time they are involved in a ball path that leads to a shot. The researchers believe it's this last metric, flow centrality, that holds the key to better quantitative analysis of individual soccer performance.
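As a rough sketch of that last idea, the snippet below takes the description literally: a player's flow centrality is the share of shot-ending ball paths in which the player took part. This is a simplified reading of the description above, not the formula from the paper, and the sequence format and the name `flow_centrality` are assumptions for illustration.

```python
from collections import Counter

# Hypothetical possession sequences (invented for illustration):
# (players who touched the ball, in order; whether the path ended in a shot)
sequences = [
    (["A", "B", "C"], True),
    (["B", "C"], True),
    (["A", "B"], False),
    (["C", "A"], True),
    (["D", "A"], False),
]


def flow_centrality(sequences):
    """Share of shot-ending paths that each player was involved in.

    Simplified interpretation: a player counts once per path,
    no matter how many touches they had in it.
    """
    all_players = {player for players, _ in sequences for player in players}
    shot_paths = [players for players, ended_in_shot in sequences if ended_in_shot]
    if not shot_paths:
        return {player: 0.0 for player in all_players}
    involvement = Counter()
    for players in shot_paths:
        for player in set(players):
            involvement[player] += 1
    return {player: involvement[player] / len(shot_paths) for player in all_players}


centrality = flow_centrality(sequences)
```

Here player C appears in all three paths that end in a shot, while player D appears in none, so C gets the highest score and D the lowest.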

The new metric also provides some solid support for the argument that superstars do matter, even in a quintessential team game like soccer. Using the 2008 European Championship as their data set, the researchers found the highest correlation between a team's chance of winning and the flow centrality of its top two performers. And the metric is generally a good predictor - they found it doesn't take much of a flow advantage for a team to have 3:1 odds of winning a given match.


Their results also accord well with the more subjective views of those watching the games, always a good sanity check for new, advanced statistics. The researchers compared their list of the tournament's top twenty performers with the players selected for team of the tournament honors. The two lists had eight players in common, which is a strikingly significant result: even four players in common would have only a 1/100,000 probability of happening by chance, and there's essentially zero chance that an eight-player overlap could happen by coincidence.


And it's not just athletics for which flow networks can be used to chart individual performances. The team suggests any cooperative activity in which information is passed from one person to another - say, a project that involves lots of phone calls or emails - can be analyzed using these networks. They looked at some of their own scientific papers and how each of them had interacted with their coauthors, charting which communiques led to forward progress like new data or even completion, which led to scheduling a meeting, and which came to nothing.

Although they are very optimistic about providing a hugely useful metric to the field of soccer analysis, they are much more cautious when it comes to extending their work. This is probably wise, considering they came this close to providing a method for figuring out who really pulls their weight on scientific papers. Ask any scientist, and I'm sure they'd prefer that level of statistical precision remain safely confined to the soccer pitch.


[PLoS ONE]

Top image shows Messi on the left against South Korea and Diego Maradona against Belgium - via the Telegraph.