A 5,000-to-1 Long Shot

Back in May 2012, shortly after the rollicking Premier League season in which Manchester City, tied on points with Manchester United, won the title on the back of Sergio Aguero’s thrilling last-minute winner against QPR, independent football analyst James Grayson published a short blog post on his simple, eponymous WordPress site. It was titled “Manchester United: A fading star?”

Grayson, a graduate student studying in British Columbia, Canada, had earlier developed an elegant predictive metric called Total Shots Ratio (TSR), based on over a decade’s worth of Premier League shot data. Similar to the Corsi stat in current use in hockey analytics circles, TSR is a Pythagorean expression of team shots: total shots for / (total shots for + total shots against). Grayson ran some regressions and noted that the stat was very good at predicting final points totals and stabilized over a small number of games. “Not only does TSR correlate well with the number of points a team scores in a given season (R^2 = 0.66), but it also is an incredibly repeatable metric,” he would later write.

It’s not a perfect stat (as if there were such a thing), but when balanced against another metric, PDO (shot% + save%), which is an indication of team luck, it can provide a more accurate picture of team performance than impression alone.
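For readers who want to see the arithmetic, here is a minimal sketch of both metrics in Python. It assumes PDO is expressed as a sum of fractions (so 1.0 is the break-even value) and is computed from the same shot counts as TSR; some analysts use shots on target or scale the result by 1,000 instead. The function names and the season totals are hypothetical and are not taken from Grayson’s data.

```python
def total_shots_ratio(shots_for: int, shots_against: int) -> float:
    """Share of all shots in a team's matches that the team itself took."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shots recorded")
    return shots_for / total


def pdo(goals_for: int, shots_for: int, goals_against: int, shots_against: int) -> float:
    """Shooting percentage plus save percentage, both as fractions.

    Assumption: all shots are counted, not just shots on target.
    """
    if shots_for == 0 or shots_against == 0:
        raise ValueError("both teams need at least one shot")
    shooting_pct = goals_for / shots_for
    save_pct = 1 - goals_against / shots_against
    return shooting_pct + save_pct


# Hypothetical season totals, for illustration only.
tsr = total_shots_ratio(shots_for=560, shots_against=440)
luck = pdo(goals_for=70, shots_for=560, goals_against=44, shots_against=440)
print(f"TSR = {tsr:.3f}, PDO = {luck:.3f}")
```

A TSR above 0.5 means the team outshoots its opponents, while a PDO above 1.0 under this convention suggests its results are running ahead of its shot numbers.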

And so after using TSR to assess team performance at the end of the 2011-12 Premier League season, something caught Grayson’s attention involving Manchester United’s performance under Sir Alex Ferguson going back to the early 2000s. He wrote:

To summarise the pattern, United played at a consistent 0.600 level until the summer of 2003. A one season drop to the 0.550 level was followed by a one year dramatic rise to their peak, at 0.650. They then stayed at a remarkably high and consistent level until the summer of 2009, at which point their performance began to slide. Right now they sit at about 0.560. This drop off over the last two years is a pretty significant one. A team with a TSR of 0.632 will typically score 90 points, at a TSR of 0.550 we’re more talking an average of just 60.

Grayson went on to note that the drop in TSR correlated with the departure of both Cristiano Ronaldo and Carlos Tevez. What was more remarkable, however, was how high United finished in the table that season: tied on points with City, who won on goal difference, despite their poor TSR. Grayson nevertheless warned, “If [United’s TSR] continues at the same level it’s sure to catch up with them.”

Except it didn’t. At least not right away.

In fact, the following season Manchester United finished with 89 points, securing the division title for the 20th time despite a TSR of 0.534, which should have put them somewhere around 5th place. It was, in Grayson’s estimation, a “5,000 to 1 longshot.” “United are doing something to affect their performance which TSR cannot explain,” he wrote.

A failed model?

It would have been easy in May of 2012, and especially April of 2013, to dismiss Grayson’s work in light of Man United’s real world success. What good is a statistical model which can fail so badly in predicting the performance of one of the league’s best teams across an entire season?

For Grayson and other football analysts who view their work in terms of process rather than results, however, the “failure” of such a strong model was, in fact, a challenge. Some pored through other metrics like Expected Goal data or Game States, looking for patterns to explain the discrepancy. Others looked to tactical explanations, or raised the possibility of blind luck playing a role in United’s success.

Yet there was another common-sense explanation, one which to this day isn’t definitively supported by the data: Sir Alex Ferguson was one of the greatest managers who ever lived, and as such was able to get a decidedly mediocre team to somehow over-perform. It was a thesis already understood by many hardcore United fans, who sounded the alarm well before Ferguson announced his retirement at the end of the 2012-13 season. Sir Alex Ferguson provided the winning edge for a team that just didn’t have good bones.

In 2013-14, with David Moyes replacing Ferguson on the bench, the chickens finally came home to roost. United under Moyes posted a TSR slightly higher than that of Sir Alex Ferguson’s championship season the year before, but finished in 7th place with 64 points. Moyes predictably, and certainly not unreasonably, received the bulk of the blame and was sacked in April before the season had finished.

There was a sense this summer that United would finally right the wrong of Moyes when they hired Dutch manager Louis van Gaal. Here was an experienced fighter who had just taken the Netherlands to a third place finish in the World Cup. The ship at Old Trafford was finally back on course.

Amid all this, however, there was one dissenter ahead of the current Premier League season: Grayson. He didn’t write an anguished op-ed, or offer a subjective opinion based on a hunch of some sort. Rather, this past August, he methodically refined a Team Rating metric that improves on TSR’s strengths, and simply applied it to all twenty Premier League teams. His model, published in full and based almost entirely on the exotic football data known as ‘shots,’ predicts United will finish in 6th place.

The season is still in its infancy and all bets are off. But Man United under Van Gaal have already lost two and drawn two in all competitions, including a 4-0 loss to MK Dons in the League Cup. No serious analyst would draw definitive conclusions from these early results, but the possibility remains that the stewardship of Van Gaal, and some major summer additions including loanee Radamel Falcao, may not be enough to improve United’s form. A problem which Grayson addressed two years ago may persist at Old Trafford for at least another season, with no clear sense the team has a concrete plan in place to address it other than flashing cash in the transfer market.

The Story of Football Analytics

So why spend a thousand words of a debut analytics column on Man United’s continued struggles on the pitch? Because for me, the work of Grayson and others looking into Manchester United’s difficult transition following the departure of Sir Alex Ferguson is the story of football analytics. Here was a statistical model which stubbornly predicted United weren’t as good as the table suggested, and may have been over-performing under their legendary manager.

Running a football club involves making some very difficult decisions based on very complex judgments: why do teams underperform? When is it the players’ fault, and when is it the manager’s, and when is it both? What role does luck play? How can one reliably tell the difference?

Until recently, these judgments and decisions were made mostly on the basis of the league table and the subjective impressions of experts within and without the game. Sometimes this approach proves fruitful. Often, however, it is rife with all kinds of heuristic biases which distort judgments and can lead to decisions that further exacerbate the core problem. Sometimes these experts will change their view week-to-week, providing ready-made explanations for this win and this loss.

In contrast, as in Grayson’s look at Man United, the data drawn from sound statistical analysis may point to an underlying problem not necessarily reflected in the league table, which, if left unaddressed, can lead to major issues down the road. We also know through the work of psychologists like Paul Meehl that in unpredictable fields like football, the “accuracy of experts is almost always matched or exceeded by simple algorithms,” to quote Daniel Kahneman’s Thinking, Fast and Slow.

In a basic example, a team with a low TSR (a measurement of skill) but a high PDO (a measurement of luck) may temporarily storm up the table and impress the pundits, but that team will inevitably regress as the season progresses, for which those same pundits will no doubt have a ready-made explanation. Comparing TSR to PDO is a crude measurement of underlying performance, but it can be improved on with added layers of complexity, from measuring Expected Goals based on shot locations, to studying how teams perform when one goal down vs how they perform when one goal to the good.
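The crude TSR-versus-PDO comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cutoffs (a TSR below 0.50 and a PDO above 1.00, with PDO expressed as a sum of fractions), the TeamForm class, and the three teams are all hypothetical, and none of it reproduces Grayson’s actual model.

```python
from dataclasses import dataclass


@dataclass
class TeamForm:
    name: str
    tsr: float  # total shots ratio, between 0 and 1
    pdo: float  # shooting% + save% as fractions; 1.0 is break-even


def regression_candidates(teams, tsr_cutoff=0.50, pdo_cutoff=1.00):
    """Return names of teams that are outshot yet running above break-even PDO.

    The cutoffs are illustrative assumptions, not published thresholds.
    """
    return [t.name for t in teams if t.tsr < tsr_cutoff and t.pdo > pdo_cutoff]


# Made-up teams, for illustration only.
teams = [
    TeamForm("Team A", tsr=0.45, pdo=1.06),  # outshot but lucky
    TeamForm("Team B", tsr=0.58, pdo=1.01),  # strong shot share
    TeamForm("Team C", tsr=0.47, pdo=0.95),  # outshot and unlucky
]
flagged = regression_candidates(teams)
print(flagged)
```

Only the first team is flagged here: it is outshot by its opponents while its finishing and goalkeeping numbers run hot, which is the profile the paragraph above expects to regress.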

These kinds of measures don’t necessarily offer solutions to the problems they diagnose, and neither do they provide incontrovertible proof for one perspective over another. But they are better than groping in the dark, and can provide a firmer foundation for long-term planning and preparation. They can also flag potential problems before they manifest themselves on the pitch.

My analytics column for theScore had several unstated purposes: emphasizing the importance of maintaining a good process despite individual results, highlighting the excellent work of Grayson and others doing some of the best work in analytics today, discussing ways in which clubs can practically apply the latest models and visualizations to day-to-day football operations, and finding ways to bridge the gap between theory and practice. They will be my goals too for Trends in Analytics for 21st Club, a company for whom sensible long-term planning based on sound data is a cornerstone. Stay tuned…