Guess who!

Named after Charles Brownlow, the Brownlow medal is awarded to the player judged the “best and fairest” over the duration of an AFL football home and away season. The Brownlow is presented on the Monday evening before the grand final and is the crowning social event on the AFL calendar. Betting on the Brownlow is a great Australian pastime, perhaps not of the same magnitude as the Melbourne Cup, but definitely with a similarly high percentage of punters losing money on frivolous betting.

Brendon Fevola has a long history at the Brownlow Medal count and currently hands out gambling advice for OddsChecker.

The aim of this experiment was to try and use statistics to predict the Brownlow medal. In the process we would be taking out the human component in betting – the part of you which always wants to have a flutter on Sam Mitchell or Kepler Bradley – just in case. We will try and keep it relatively maths free, with most of the theory provided with links to well-explained pages. If you are really impatient, you can skip all the way to the bottom and look at the predictions.

Human bias is almost impossible to avoid, particularly in relation to gambling. There is a reason why running a book is an ancient and successful way of making money. See here for a good summary of why humans typically suck at gambling. The Brownlow medal is no exception; in fact, it’s likely worse due to factors such as:

The wide field of contenders (Roughly 300 players per week, 9 games, 23 rounds = A lot of data)

The long period of time between round 1 and Brownlow night: people need to summarize half a year into what is essentially a complex probability equation. It will invariably be skewed towards the second half of the season and even finals (which obviously don’t count).

The inherent reliance on the rationality of umpires. We don’t need to go far past the James Hird debacle of 2004 for evidence that the system can be flawed. Hird had publicly criticized umpire Scott McLaren during the week, leading to fines and public shaming. He went on to have one of the best games of his career against the West Coast Eagles that weekend, picking up 33 disposals, including 14 in the last quarter, along with three goals (you probably remember this). He received no votes for the game and umpires confirmed to no one that they hold grudges.

How the Brownlow Voting works:

The process for distributing Brownlow votes is simple and time-honoured. Voting is carried out by the field umpires immediately after each game, with 3, 2 and 1 votes distributed once consensus is reached. No statistics are used in the process, which has been carried out in this way since 1930 (with a small hiatus in 1976-1978, when two field umpires both voted).

Shane Woewodin was the recipient of the 2000 Brownlow medal.

To the ire of some media commentators, the Brownlow medal has been historically dominated by midfielders. Glenn Mitchell for The Roar concisely summarized and discussed the breakdown here. The medal has been presented in 85 seasons and, accounting for ties, a total of 98 medals have been awarded. Of these, 61 have been won by midfielders (centremen, wingmen, ruck-rovers and rovers), and another 19 by ruckmen. This leaves 18 medals to be shared by forwards and defenders (who make up 66% of the players on the field at any given time). In recent years, midfielders’ domination of the medal has increased, with only one non-midfielder winning the award since 1996 (Adam Goodes, playing predominantly in the ruck in 2003; Goodes, however, went on to prove he was no ordinary ruckman when he won the award again in 2006 playing mostly as a wingman). In 2015, only 2 non-centre players bucked the trend to finish in the top 20: Todd Goldstein at 10 and Aaron Sandilands at 15. The highest ranked forward was Jeremy Cameron from the GWS Giants with 12 votes (19 behind medal winner Nat Fyfe).

While the playing position of the Brownlow winner is relatively predictable, accurate prediction of the eventual vote count is a much more difficult proposition. There is little publicly available literature on prediction of Australian sports in general, and even less on the Brownlow medal. The official AFL website runs a Brownlow Predictor, updated weekly throughout the year, and the guys at Phantom run a great blog dedicated to this very question. The AFL website doesn’t reveal its prediction mechanisms, and Phantom uses the average results of “Expert Voting” for theirs. Only two researchers (that we can find) have publicly investigated the prediction of the Brownlow using statistical methods – Michael Bailey in 2005, and Robert Nguyen in 2014 (unpublished, but links to his work in the media here and here).

A snapshot of Michael Bailey’s predictions from 2000 and 2001. (He got Shane Woewodin drastically wrong, but we can’t hold that against him)

As part of his thesis for Swinburne University in 2005, Michael investigated the potential for predicting the Brownlow using an ordinal logistic regression model and the standard suite of statistics that Champion Data doesn’t hog (and obtain themselves, in fairness). An ordinal logistic regression is a relatively simple prediction technique, summarised very elegantly here. Bailey essentially identified the optimal combination of statistical variables for predicting the likelihood of a player receiving 3, 2 or 1 votes in a match.
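For the curious, here is a minimal sketch of how an ordinal (proportional-odds) logistic model turns one number summarising a player's game into probabilities of 0, 1, 2 or 3 votes. The cut-points and the score are made-up illustrative values, and the function names are our own; none of this is taken from Bailey's fitted model.

```python
import math

def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def vote_probabilities(score, cutpoints):
    """Return [P(0 votes), P(1 vote), P(2 votes), P(3 votes)].

    score     -- weighted combination of a player's match statistics
    cutpoints -- three increasing thresholds separating the four outcomes
                 (hypothetical here; a real model would estimate them)
    """
    # Proportional-odds form: P(votes <= k) = sigmoid(cutpoint_k - score)
    cumulative = [sigmoid(c - score) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for k in range(1, 4):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs

# Hypothetical cut-points and score, chosen only for illustration.
CUTPOINTS = [2.0, 3.0, 4.0]
probs = vote_probabilities(score=3.0, cutpoints=CUTPOINTS)
expected_votes = sum(votes * p for votes, p in enumerate(probs))
print([round(p, 3) for p in probs], round(expected_votes, 3))
```

Summing the expected votes from every game a player plays gives a predicted season total, which is the kind of number that can then be compared with the real count.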

There was a lot of work that went into Michael’s model and anyone interested in this field should definitely read the entire thesis. Three important and interesting points to come out of it were:

Players with distinctive appearances were twice as likely to poll votes as “non-descript” players (0.24 votes per game instead of 0.12 votes per game). For his thesis, Michael described distinctive appearance as any player with red or blond hair, or with significantly darker or lighter skin. Had he done his thesis more recently, he would have definitely included shaved heads, heavy tattooing and what I call “Mitch Robinson Head”.

It is statistically difficult to predict the top ten’s vote counts with reasonable accuracy. However, for the remainder of players, prediction becomes much more reliable. To quote Michael: “By accurately predicting 66% of players to within one vote of their actual total, and 90% of players to within three votes of their total, the modelling process provides an objective assignment of probabilities that has many benefits.”

The prior 3-4 years is the optimal training period, providing the most accurate prediction in any given year. If trained for longer, the performance of the model actually decreased. This is likely due to numerous factors: changes in game style, media attention or even implicit directions by the AFL to punish certain football clubs for at best circumstantial offenses*.
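The “within one vote” and “within three votes” figures in Bailey's quote above are easy to compute for any set of predictions. A small sketch, using made-up totals for five hypothetical players (the function name is ours):

```python
def share_within(predicted, actual, tolerance):
    """Fraction of players whose predicted total is within `tolerance` votes of the actual total."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and the same length")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)

# Made-up season totals, for illustration only.
predicted = [0, 2, 5, 11, 24]
actual = [0, 3, 8, 10, 31]

within_one = share_within(predicted, actual, 1)
within_three = share_within(predicted, actual, 3)
print(within_one, within_three)
```

Note how the one big miss in this toy data is the highest vote-getter, which mirrors the pattern described above.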

In general, the approach we used was similar to Michael Bailey’s previous methods. An analysis of variable importance was carried out, as well as experiments into the ideal length of training data (how many seasons to use). We did not use the distinctive features variable as it seems that it might add some subjective bias, and was also beyond the limits of our lazy meter. We incorporated the use of Dream Team scores as a variable, which applied a scaling of importance to particular variables (6 points for a goal, 4 for a tackle and so on). Dream Team score actually ended up being the most important variable, which would come as no surprise to those who are avid fantasy football coaches.
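As a rough sketch of what a Dream Team score is, the snippet below computes a weighted sum of a player's match statistics. Only the goal and tackle weights come from the paragraph above; the other weights are our assumption based on commonly published fantasy scoring and may not match the exact values in the data we used. The example stat line is invented.

```python
# Points per statistic. Goals (6) and tackles (4) are from the text;
# every other weight is an assumption made for illustration.
DREAM_TEAM_WEIGHTS = {
    "goals": 6,
    "tackles": 4,
    "kicks": 3,
    "handballs": 2,
    "marks": 3,
    "hitouts": 1,
    "behinds": 1,
    "frees_for": 1,
    "frees_against": -3,
}

def dream_team_score(stats):
    """Weighted sum of match statistics; statistics missing from `stats` count as zero."""
    return sum(weight * stats.get(name, 0) for name, weight in DREAM_TEAM_WEIGHTS.items())

# Hypothetical stat line for one player in one game.
example_game = {"kicks": 20, "handballs": 13, "marks": 6, "tackles": 5, "goals": 3, "frees_against": 1}
score = dream_team_score(example_game)
print(score)
```

Because the score already bundles several raw statistics into one number, it is not too surprising that a model finds it more useful than any single statistic on its own.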

An equalised histogram showing the likelihood of scoring a Brownlow vote relative to Dream Team score. The blue bars represent a count of players who scored votes, and the red bars those who did not. The points with error bars show the likelihood of a vote or a non-vote occurring. You will note that at around 125 Dream Team points the chance of scoring votes goes above 50% – i.e. you are more likely than not to poll.
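The likelihood points in a chart like this can be approximated by grouping games into score bins and taking the share of games in each bin that polled votes. A minimal sketch on invented data (the bin width, function name and numbers are all assumptions for illustration, not our real figures):

```python
from collections import defaultdict

def vote_rate_by_bin(games, bin_width=25):
    """Share of player-games that polled votes, per Dream Team score bin.

    games -- iterable of (dream_team_score, polled_votes) pairs
    Returns {bin_start: share_with_votes}, ordered by bin_start.
    """
    counts = defaultdict(lambda: [0, 0])  # bin start -> [games with votes, all games]
    for score, polled in games:
        start = (score // bin_width) * bin_width
        counts[start][1] += 1
        if polled:
            counts[start][0] += 1
    return {start: voted / total for start, (voted, total) in sorted(counts.items())}

# Invented player-games: (Dream Team score, polled at least one vote?)
games = [
    (60, False), (70, False), (95, False), (110, False), (115, True),
    (120, False), (130, True), (140, True), (145, False), (160, True),
]
rates = vote_rate_by_bin(games)
for start, rate in rates.items():
    print(f"{start}-{start + 24}: {rate:.2f}")
```

With real data there are thousands of player-games per season, so the bins can be much narrower than the 25 points used here.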

The above image shows the relative importance of each input variable for predicting Brownlow votes. Unsurprisingly, Dream Team score is the strongest predictor, and the number of free kicks against a player has no influence. Interestingly, the number of free kicks a player receives is more important than contested marks – Hello Joel Selwood.

While a logistic regression is a fairly simple technique, Michael Bailey still managed to predict Brownlow votes to a reasonable accuracy, particularly for lower vote-getters. We decided to trial some more complex prediction algorithms to determine whether we could at least emulate Bailey’s work, and potentially improve on it. Using the common machine learning technique, Random Forest, and publicly available statistics data (1,2) we attempted to predict the 2016 Brownlow Medal.

A very brief summary of the theory: A Random Forest is basically a more complex, iterative version of the easier to understand decision tree. A decision tree works out a set of “questions” to ask the data, to determine the optimal way to split the training data into categories – in this case, “No Votes” or “Votes”. The example below is an extremely simple version of a decision tree – the Random Forest would do thousands of these using different subsets of training data and variables to calculate the most robust prediction model possible. Once a set of “rules” has been established, test data (i.e. 2016 data) can be fed into the algorithm. See here and here for more detailed and excellent descriptions of how a Random Forest works.

An example (and potentially over-simplified) decision tree showing the underlying logic behind the Random Forest Algorithm.
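To show the idea in code, here is a toy forest built from one-question trees (decision stumps). Each stump is trained on a random resample of the data and a randomly chosen variable, and the forest's answer is the share of stumps that say “Votes”. This is a deliberately stripped-down stand-in for a real Random Forest, not the implementation we ran, and the training rows are invented.

```python
import random

def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows.

    The stump answers "Votes" (1) when the feature value is above the threshold.
    """
    candidates = [float("-inf")] + sorted({row[feature] for row in rows})
    best_threshold, best_errors = None, len(rows) + 1
    for candidate in candidates:
        errors = sum(
            1 for row, label in zip(rows, labels)
            if (1 if row[feature] > candidate else 0) != label
        )
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return feature, best_threshold

def fit_forest(rows, labels, n_trees=200, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    features = list(rows[0].keys())
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        sample_rows = [rows[i] for i in picks]
        sample_labels = [labels[i] for i in picks]
        forest.append(fit_stump(sample_rows, sample_labels, rng.choice(features)))
    return forest

def predict_vote_probability(forest, row):
    """Share of stumps that answer "Votes" for this player-game."""
    yes = sum(1 for feature, threshold in forest if row[feature] > threshold)
    return yes / len(forest)

# Invented training data: one row per player-game, label 1 = polled votes.
rows = [
    {"dream_team": 62, "frees_for": 1}, {"dream_team": 75, "frees_for": 0},
    {"dream_team": 88, "frees_for": 2}, {"dream_team": 97, "frees_for": 1},
    {"dream_team": 110, "frees_for": 0}, {"dream_team": 118, "frees_for": 3},
    {"dream_team": 127, "frees_for": 2}, {"dream_team": 135, "frees_for": 1},
    {"dream_team": 142, "frees_for": 3}, {"dream_team": 156, "frees_for": 2},
]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
star = predict_vote_probability(forest, {"dream_team": 150, "frees_for": 3})
battler = predict_vote_probability(forest, {"dream_team": 70, "frees_for": 0})
print(star, battler)
```

A real Random Forest grows much deeper trees and considers a random subset of variables at every split, but the resample-then-vote structure is the same.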

The algorithm was tested over several years to ensure it was predicting within an acceptable range. In general, our findings were consistent with Bailey’s in that the algorithm was much more accurate for the lower-ranked players than for the top 10-20. We have included the results of that in a link at the bottom, but for interest’s sake, here is the top 20 from last year:

2015 Modeled Vote Distributions versus Actual Results (Top 20 Players). For an example of how to read a Box Plot, click here.

Comments on 2015:

The model picked Nat Fyfe and Matt Priddis to be going head to head at the end, which turned out to be correct. Fyfe had a tighter range of possible scores than Priddis, but Priddis had a higher average.

Lachie Neale is apparently statistically a Brownlow gun, but the umpires haven’t realised yet. His differential of 13 votes is the biggest difference between predicted and actual votes in the entire prediction.

Most bookies had Scott Pendlebury to poll more votes than Dane Swan. If the model had been followed, there was money to be made there. In addition, the model nailed Pendlebury’s predicted votes.

The model correctly identified that Zak Dawson would not poll a vote.

We were fairly happy with this result, so we included last year’s data in the training algorithm and ran the numbers on 2016 statistics. The expected number one vote-getter should surprise no one – Patrick Dangerfield finished on top in no less than 100% of the iterations, by a minimum of 7 votes. For this model to be correct, Dangerfield would need to obtain the most votes ever by a player in a season, so there is room for skepticism. What is clear, however, is that he has had a statistically superior year to all his peers.
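Statements like “on top in 100% of the iterations, by a minimum of 7 votes” come from comparing the predicted totals within each model run. A minimal sketch of that bookkeeping, using invented totals for three hypothetical players (the function name and data layout are our own for this example):

```python
def top_share_and_min_margin(iterations, player):
    """Share of iterations where `player` leads outright, and their smallest margin.

    iterations -- list of {player_name: predicted_season_votes} dicts, one per model run
    A negative margin means the player trailed the leader in at least one run.
    """
    wins, margins = 0, []
    for totals in iterations:
        best_other = max(votes for name, votes in totals.items() if name != player)
        margin = totals[player] - best_other
        margins.append(margin)
        if margin > 0:
            wins += 1
    return wins / len(iterations), min(margins)

# Invented predicted totals from three model runs, for illustration only.
iterations = [
    {"Player A": 38, "Player B": 29, "Player C": 27},
    {"Player A": 35, "Player B": 28, "Player C": 26},
    {"Player A": 40, "Player B": 27, "Player C": 31},
]
share, min_margin = top_share_and_min_margin(iterations, "Player A")
print(share, min_margin)
```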

Without further ado, we humbly put our 2016 Brownlow Medal Top 20 predictions in the public domain:

2016 Modeled Vote Distributions (Top 20 Players).

Comments on 2016 (More to come as the day approaches):

Patrick Dangerfield has had an amazing year. Put your house on it. Put your neighbour’s house on it. You might only make 30 bucks profit, but it’s safe money**.

Sam Mitchell and Joel Selwood were (at the time of writing) both 11 dollars to win the Brownlow in the Danger-free market at Sportsbet. Sam Mitchell seems to have the most upside of the two, particularly when you take into account the Danger effect.

It is worth reiterating that the value in using an algorithm like this is in the players who score between 10 and 20 votes. The error in the model drops significantly after the top 10 to 15 players. With this in mind, markets such as “Most at the club” or “Will this player improve on their total from last year” are likely to provide the best value.

We will report back shortly after the Brownlow to either bathe in glory, or eat humble pie. If anyone picks up any errors, has any suggestions or questions, wants different data or just to chat about methods please hit us up. The main reason behind creating this blog was to engage different people in the community and develop a dialogue in this niche area of interest. Also, none of our friends will listen to us anymore and we need a new audience.

Detailed results can be found here:

*Allegedly

** Do NOT make any bets using this information that you aren’t prepared to lose. This is a summary of an algorithm and the results – not gambling advice. D.Y.O.R.

Notes: