By Andrew Puopolo



This weekend marks the 3rd Round of the Football Association Cup, the oldest and most prestigious domestic cup competition in the entire world. Every year, amateur and professional teams from across England compete to lift this coveted trophy at London’s Wembley Stadium in May. The third round is when teams from the Premier League enter the competition, and there are quite a few “David vs Goliath” matchups involving Premier League teams, including Tottenham Hotspur’s trip tomorrow night to 4th tier Tranmere Rovers, Arsenal’s journey to 3rd division Blackpool on Saturday lunchtime in a rematch of their League Cup 4th round tie, and Watford’s visit to 6th tier Woking (who are coached by legendary commentator Martin Tyler) on Sunday afternoon. Each of the 64 teams still remaining in this competition will dream of either knocking off a top side or reaching the semifinals and final at Wembley.



Throughout the 20th century, Tottenham Hotspur were considered to be the “specialists” of the FA Cup, having won the competition 8 times between 1901 and 1991 (including 5 times when the year ended in one) but the top division title only twice. However, Spurs haven’t won the cup since 1991, and their claim on the crown has been challenged by their two biggest London rivals. From the late nineties, current holders Chelsea went on to become the cup kings, winning in 1997 and 2000 and then four times in six years between 2007 and 2012. More recently, the competition has been dominated by Arsenal, who won the cup 3 times in 4 years despite not having won the league title since 2004.



This raises the question: which team are the true FA Cup specialists? In other words, which team has historically performed better than expected, on average?



To answer this question, I decided to fit an Elo model (based on the results of league matches only) to come up with a relative proxy for a team’s strength, and calculated this value for each team in all four tiers of English football on January 1 of the relevant season (as the 3rd round is traditionally held the first weekend in January). For each season from 1960/61 to 2016/17, I calculated how far each team in the dataset went in the competition. These values ranged from 1 (a 3rd or 4th division team being eliminated in the 1st round) to 9 (winning the whole tournament). Using the Elo values as a predictor, I fit an ordered logit model (R output at the end of the article for all you stats nerds) to estimate a probability distribution for how far a team should have gone in the tournament given their performance in the league. I then used this distribution to determine two things: the probability that the team went at least as far as they did in the competition, and the probability that the team was knocked out at that round or earlier.
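For readers who want to see the moving parts, here is a minimal Python sketch of the two ingredients described above: a standard Elo update and the outcome probabilities implied by an ordered logit model. It is not the code used for this article (that was done in R). The K-factor of 20, the 400-point scale, and the slope and cutpoint values are all illustrative assumptions rather than fitted numbers.

```python
import math


def elo_expected(rating_a, rating_b, scale=400.0):
    """Expected score for team A against team B under the usual Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a, rating_b, score_a, k=20.0):
    """Return both ratings after one match; score_a is 1 (win), 0.5 (draw) or 0 (loss).

    k=20 is an assumed K-factor, not the value used in the article.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def ordered_logit_probs(elo, slope, cutpoints):
    """Probability of each ordered outcome given a rating.

    Uses the cumulative form P(Y <= j) = logistic(cutpoint_j - slope * elo),
    with cutpoints in increasing order. Eight cutpoints give nine outcomes,
    from 1 (out in the 1st round) to 9 (winning the cup).
    """
    def logistic(x):
        return 1.0 / (1.0 + math.exp(-x))

    cumulative = [logistic(c - slope * elo) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


# Hypothetical parameters, chosen only so the example produces sensible numbers.
HYPOTHETICAL_SLOPE = 0.01
HYPOTHETICAL_CUTPOINTS = [12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0]

if __name__ == "__main__":
    home, away = elo_update(1500.0, 1500.0, score_a=1.0)
    print(f"After a win between equal teams: {home:.1f} vs {away:.1f}")
    probs = ordered_logit_probs(1700.0, HYPOTHETICAL_SLOPE, HYPOTHETICAL_CUTPOINTS)
    for outcome, p in enumerate(probs, start=1):
        print(f"outcome {outcome}: {p:.3f}")
```

With a positive slope, a higher Elo rating shifts probability towards the later rounds, which is the behaviour the model relies on.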



To understand how these probabilities work and how we are applying them to this problem, consider a scenario in which the ordered logit model tells us that Manchester United had a 20% chance of reaching the quarterfinal and losing, a 40% chance of reaching the semifinal and losing, a 30% chance of reaching the final and losing and a 10% chance of winning the entire tournament. If they were knocked out in the final, then the probability of them reaching at least the final is 40% (30% + 10%), and the probability of them being knocked out in the final or earlier is 90% (20% + 40% + 30%).
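The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the worked example above; the stage names and function names are made up for illustration.

```python
# Hypothetical distribution from the Manchester United example,
# listed from the earliest exit to winning the cup.
stage_probs = {
    "quarterfinal": 0.20,
    "semifinal": 0.40,
    "final": 0.30,
    "winner": 0.10,
}
stages = list(stage_probs)  # dicts keep insertion order


def prob_at_least(stage):
    """Probability of getting at least as far as `stage`."""
    start = stages.index(stage)
    return sum(stage_probs[s] for s in stages[start:])


def prob_at_most(stage):
    """Probability of going out at `stage` or earlier."""
    end = stages.index(stage)
    return sum(stage_probs[s] for s in stages[: end + 1])


print(f"Reached at least the final: {prob_at_least('final'):.0%}")   # 40%
print(f"Out in the final or earlier: {prob_at_most('final'):.0%}")   # 90%
```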



After compiling these statistics for all teams in all seasons, we then calculated the geometric mean of each team’s probabilities to try to determine which team are the true “FA Cup specialists.” The geometric mean was chosen because it is similar to computing the log-likelihood of all these FA Cup finishes, but adjusts for the fact that not all teams have competed in all 57 seasons. A lower geometric mean means that it is less likely that the team performed as well as they did, and provides more evidence for that team being a cup specialist.
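To make the link to the log-likelihood concrete: the geometric mean of a set of probabilities is the exponential of their average log, so it is a per-season version of the log-likelihood. Here is a small Python sketch. The two lists of season probabilities are invented for illustration and are not taken from the data.

```python
import math


def geometric_mean(probs):
    """Geometric mean of probabilities, computed as exp(mean(log p))."""
    if not probs:
        raise ValueError("need at least one probability")
    if any(p <= 0 for p in probs):
        raise ValueError("probabilities must be strictly positive")
    return math.exp(sum(math.log(p) for p in probs) / len(probs))


# Made-up probabilities of doing at least as well as the team actually did.
team_with_three_seasons = [0.40, 0.10, 0.25]
team_with_two_seasons = [0.20, 0.20]

# Dividing the summed log-probabilities by the number of seasons keeps
# teams with different amounts of history on the same scale.
print(round(geometric_mean(team_with_three_seasons), 3))
print(round(geometric_mean(team_with_two_seasons), 3))
```

Summing the logs without dividing by the number of seasons would penalise teams simply for having been in the dataset longer, which is why the mean is used rather than the raw log-likelihood.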



We end up with all three London clubs that we mentioned earlier, as well as Manchester United. Fans who have only recently started watching English football will be surprised by this, as United won the league 13 times between 1993 and 2013. However, the United sides from before Alex Ferguson’s period of dominance were not world beaters, yet had considerable success in the cup (relative to the league), winning it in 1977, 1983, 1985 and 1990. Watford are an interesting team to see on this list, as they were beaten finalists in 1984 against Everton (another team on this list), as well as beaten semifinalists in 1970, 1987, 2003, 2007 and 2016.



However, seeing Peterborough United at the top of this list was incredibly shocking, as they are a team that have historically bounced between the third and fourth tiers (with a few seasons in the second tier). In fact, many people know Peterborough for their nickname “the Posh” and the resulting lawsuit with David Beckham’s wife over the club’s decision to trademark the term “Posh” after Victoria’s rise to fame. Their appearance at the top of this list compelled me to take a deeper look into their FA Cup history, which includes a run to the quarterfinals in 1965, and an appearance in the 5th round in 1975, 1981 and 1986. This history is underwhelming, but they have consistently reached the 3rd or 4th round, something that is not a given for teams in the bottom two tiers of the Football League.



Now, we’ll take a look at the ten teams who have performed most poorly compared to what their league form would suggest.



This list includes some of the teams we might have expected. Newcastle United are notorious for performing badly in the cup competitions, and despite their size and success in the league, they have not won the FA Cup since 1955 (though they have reached three finals since) and were famously knocked out by non-league Hereford United in 1972 (it’s impossible to write an article about the 3rd round of the FA Cup without finding some way of sneaking that Ronnie Radford goal in there). Liverpool are a surprising feature on this list, given that they’ve won the cup seven times in the 57 years of data that we used, but they have relatively underperformed in recent years and have suffered many unexpected early round exits. Aston Villa are another classic example on this list, as they’ve won almost as many European Cups (1) as they have made FA Cup final appearances (2) in that time.



Overall, it was very interesting to see which teams can be considered cup specialists by our methodology. Our results were somewhat in line with our expectations, although a few relatively unexpected teams appeared near the top of both lists.

Special thanks to James Curley for compiling and sharing the R package engsoccerdata, which was used for both the league matches to build the Elo scores and the FA Cup data that we used to build the ordered logit model.



If you have any questions for Andrew, please feel free to reach out to him by email at andrewpuopolo@college.harvard.edu or on Twitter @andrew_puopolo.



Ordered Logit Model Output:


