Reflecting on 2017, I decided to return to my most popular blog topic (at least by the number of emails I get). Last time, I built a crude statistical model to predict the result of football matches. I even presented a webinar on the subject here (it’s free to sign up). During the presentation, I described a coefficient in the model that accounts for the fact that the home team tends to score more goals than the away team. This is called the home advantage (or home field advantage) and can probably be explained by a combination of psychological (e.g. familiarity with surroundings) and physical factors (e.g. travel). It occurs in various sports, including American football, baseball, basketball and soccer. Sticking to soccer/football, I mentioned in my talk how it would be interesting to see how this effect varies around the world. In which countries do the home teams enjoy the greatest advantage?

We’re going to use the same statistical model as last time, so there won’t be any new statistical features developed in this post. Instead, it will focus on retrieving the appropriate goals data for even the most obscure leagues in the world (yes, even the Irish Premier Division) and then interactively visualising the results with D3. The full code can be found in the accompanying Jupyter notebook.

Calculating Home Field Advantage

The first consideration should probably be how to calculate home advantage. The traditional approach is to look at team matchups and check whether teams achieved better, equal or worse results at home than away. For example, let’s imagine Chelsea beat Arsenal 2-0 at home and drew 1-1 away. That would be recorded as a better home result (+2 goals versus 0). This process is repeated for every opponent, so you can construct a trinomial distribution and test whether there was a statistically significant home field effect. This works for balanced leagues, where teams play each other an equal number of times home and away. While this holds for Europe’s most famous leagues (e.g. EPL, La Liga), there are various leagues where teams play each other three times (e.g. Ireland, Montenegro, Tajikistan aka The Big Leagues) or even just once (e.g. Argentina, Libya and, to a lesser extent, MLS, which is balanced for teams within the same conference). There are also issues with postponements and abandonments rendering some leagues slightly unbalanced (e.g. Sri Lanka). For those reasons, we’ll opt for a different (though not necessarily better) approach.
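To make the traditional matchup approach a little more concrete, here is a rough sketch that classifies each pair of teams as a better, equal or worse home result and then runs a simple two-sided sign test on the non-tied pairs. The sign test is a simplified stand-in for the trinomial test mentioned above, not the same thing. The function names (classify_matchups, sign_test_pvalue) and the tuple layout are hypothetical, and the code assumes each pair meets exactly once at each ground.

```python
from collections import Counter
from math import comb


def classify_matchups(results):
    """Count better/equal/worse home results per pair of teams.

    results: iterable of (home_team, away_team, home_goals, away_goals).
    Assumes each pair meets exactly once at each ground (a balanced league).
    """
    # goal difference from the host's point of view, keyed by (host, visitor)
    host_gd = {(h, a): hg - ag for h, a, hg, ag in results}
    counts = Counter()
    for (h, a), gd_at_home in host_gd.items():
        # visit each pair once, and only if the reverse fixture exists
        if h < a and (a, h) in host_gd:
            gd_away = -host_gd[(a, h)]  # h's goal difference as the visitor
            if gd_at_home > gd_away:
                counts["better"] += 1
            elif gd_at_home < gd_away:
                counts["worse"] += 1
            else:
                counts["equal"] += 1
    return counts


def sign_test_pvalue(better, worse):
    """Two-sided sign test: are 'better' and 'worse' equally likely?"""
    n = better + worse
    if n == 0:
        return 1.0
    k = max(better, worse)
    # P(X >= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# the example from the text: Chelsea 2-0 Arsenal, then Arsenal 1-1 Chelsea
example = [("Chelsea", "Arsenal", 2, 0), ("Arsenal", "Chelsea", 1, 1)]
counts = classify_matchups(example)
print(counts)
print(sign_test_pvalue(counts["better"], counts["worse"]))
```

With a single pair there is obviously nothing to test; the point is only to show the bookkeeping. Leagues where teams meet once or three times would need extra handling, which is exactly the complication described above.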

In the previous post, we built a model for the EPL 2016/17 season, using the number of goals scored in the past to predict future results. Looking at the model coefficients again, you see the home coefficient has a value of approximately 0.3. By taking the exponent of this value ($e^{0.3} \approx 1.35$), it tells us that the home team are generally 1.35 times more likely to score than the away team. In case you don’t recall, the model accounts for team strength/weakness by including coefficients for each team (e.g. 0.07890 and -0.96194 for Chelsea and Sunderland, respectively).
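If it helps to see why exponentiating the home coefficient gives a goals multiplier, here is a minimal sketch of how a log-linear Poisson model turns coefficients into expected goals. The home and Chelsea coefficients are the ones quoted above; the intercept and opponent values are made up purely for illustration, and expected_goals is a hypothetical helper rather than anything from the original notebook.

```python
from math import exp


def expected_goals(intercept, home_coef, team_coef, opponent_coef, at_home):
    """Expected goals under a log-linear (Poisson) model.

    log(expected goals) = intercept + home_coef * at_home
                          + team_coef + opponent_coef
    """
    return exp(intercept + home_coef * at_home + team_coef + opponent_coef)


HOME_COEF = 0.3          # approximate home coefficient quoted in the text
CHELSEA_COEF = 0.07890   # Chelsea coefficient quoted in the text
INTERCEPT = 0.4          # hypothetical value, for illustration only
OPPONENT_COEF = -0.2     # hypothetical opponent coefficient

at_home = expected_goals(INTERCEPT, HOME_COEF, CHELSEA_COEF, OPPONENT_COEF, 1)
away = expected_goals(INTERCEPT, HOME_COEF, CHELSEA_COEF, OPPONENT_COEF, 0)

# the intercept and team terms cancel, leaving exp(home_coef)
print(round(at_home / away, 2))  # 1.35
```

Whatever values you pick for the intercept and team terms, the ratio stays at exp(0.3), which is why the home coefficient can be read on its own.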

Let’s see how this value compares with the lower divisions in England over the past 10 years. We’ll pull the data from football-data.co.uk, which can be loaded in directly using the url link for each csv file. First, we’ll design a function that will take a dataframe of match results as an input and return the home field advantage (plus confidence interval limits) for that league.

# importing the tools required for the Poisson regression model
import statsmodels.api as sm
import statsmodels.formula.api as smf
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn


def get_home_team_advantage(goals_df, pval=0.05):
    # extract relevant columns
    model_goals_df = goals_df[['HomeTeam', 'AwayTeam', 'FTHG', 'FTAG']]
    # rename goal columns
    model_goals_df = model_goals_df.rename(columns={'FTHG': 'HomeGoals', 'FTAG': 'AwayGoals'})
    # reformat dataframe for the model
    goal_model_data = pd.concat([
        model_goals_df[['HomeTeam', 'AwayTeam', 'HomeGoals']].assign(home=1).rename(
            columns={'HomeTeam': 'team', 'AwayTeam': 'opponent', 'HomeGoals': 'goals'}),
        model_goals_df[['AwayTeam', 'HomeTeam', 'AwayGoals']].assign(home=0).rename(
            columns={'AwayTeam': 'team', 'HomeTeam': 'opponent', 'AwayGoals': 'goals'})])
    # build poisson model
    poisson_model = smf.glm(formula="goals ~ home + team + opponent", data=goal_model_data,
                            family=sm.families.Poisson()).fit()
    # output model parameters
    poisson_model.summary()
    return np.concatenate((np.array([poisson_model.params['home']]),
                           poisson_model.conf_int(alpha=pval).values[-1]))

I’ve essentially combined various parts of the previous post into one convenient function. If it looks a little strange, then I suggest you consult the original post. Okay, we’re ready to start calculating some home advantage scores.

# home field advantage for EPL 2016/17 season
get_home_team_advantage(pd.read_csv("http://www.football-data.co.uk/mmz4281/1617/E0.csv"))

array([ 0.2838454, 0.16246 , 0.4052308])

It’s as easy as that. Feed a url from football-data.co.uk into the function and it’ll quickly tell you the statistical advantage enjoyed by home teams in that league. Note that the latter two values represent the lower and upper limits of the 95% confidence interval around the estimate. The first value in the array is actually just the log of the total number of goals scored by home teams divided by the total number of away goals.

temp_goals_df = pd.read_csv("http://www.football-data.co.uk/mmz4281/1617/E0.csv")
[np.exp(get_home_team_advantage(temp_goals_df)[0]),
 np.sum(temp_goals_df['FTHG']) / float(np.sum(temp_goals_df['FTAG']))]

[1.3282275711159723, 1.3282275711159737]
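The two confidence limits returned by the function sit on the same log scale as the estimate, so they can be exponentiated in the same way to give a range for the home/away goal ratio. A small sketch, using the EPL 2016/17 array printed earlier (to_ratio_scale is a hypothetical helper name):

```python
from math import exp


def to_ratio_scale(log_estimates):
    """Exponentiate log-scale values (estimate, lower, upper) into goal ratios."""
    return [exp(value) for value in log_estimates]


# EPL 2016/17 output quoted above: estimate, lower limit, upper limit
epl_log = [0.2838454, 0.16246, 0.4052308]
ratio, lower, upper = to_ratio_scale(epl_log)
print(f"home/away goal ratio: {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that stays above 1 on this scale (or above 0 on the log scale) suggests the home advantage is unlikely to be a fluke of that season.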

The goals ratio calculation is obviously much simpler and definitely more intuitive. But it doesn’t allow me to reference my previous post as much (link link link) and it fails to provide any uncertainty around the headline figure. Let’s plot the home advantage figure for the top 5 divisions of the English league pyramid since 2005. You can remove those hugely informative confidence interval bars by switching the toggle.

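[Interactive chart: home advantage score by season for the top 5 divisions of the English league pyramid, with an Error Bars toggle]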



It’s probably more apparent without those hugely informative confidence interval bars, but it seems that the home advantage score decreases slightly as you move down the pyramid (analysis by Sky Sports produced something similar). This might make sense for two reasons. Firstly, bigger teams generally have larger stadiums and more supporters, which could strengthen the home field advantage. Secondly, as you go down the leagues, I suspect the quality gap between teams narrows. Taking it to an extreme, when I used to play Sunday league football, it didn’t really matter where we played… we still lost. In that sense, one must be careful comparing the home advantage between leagues, as it will be affected by the relative team strengths within those leagues. For example, a league with a very dominant team (or teams) will record a lower home advantage score, as that dominant team will score goals home and away with little difference (Man Utd would probably beat Cork City 6-0 at Old Trafford and Turners Cross!).

Having warned about the dangers of comparing different leagues with this approach, let’s now compare the top five leagues in Europe over the same time period as before.

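[Interactive chart: home advantage score by season for the top five leagues in Europe, with an Error Bars toggle]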



Honestly, there’s not much going on there. With the possible exception of the Spanish La Liga since 2010, the home field advantage enjoyed by the teams in each league is broadly similar (and that’s before we bring in the idea of confidence intervals and hypothesis testing).

Home Advantage Around the World

To find more interesting contrasts, we must venture to crappier and more corrupt leagues. My hunch is that home advantage would be negligible in countries where the overall quality (team, infrastructure, etc.) is very low. And by low, I mean leagues worse than the Irish Premier Division (yes, they exist). Unfortunately, the historical results for such leagues are not available on football-data.co.uk. Instead, we’ll scrape the data off betexplorer. I’m extremely impressed by the breadth of this site. You can even retrieve past results for the French overseas department of Réunion. Fun fact: Dimitri Payet spent the 2004 season at AS Excelsior of the Réunion Premier League.

We’ll use Scrapy to pull the appropriate information off the website. If you’ve never used Scrapy before, then you should check out this post. I won’t spend too long on this part, but you can find the full code here.

You don’t actually need to run your own spider, as I’ve shared the output to my GitHub account. We can import the json file in directly using pandas.

all_league_goals = pd.read_json("https://raw.githubusercontent.com/dashee87/blogScripts/master/files/all_league_goals.json")
# reorder the columns to make it a bit more logical
all_league_goals = all_league_goals[['country', 'league', 'date', 'HomeTeam', 'AwayTeam', 'FTHG', 'FTAG', 'awarded']]
all_league_goals.head()

|  | country | league | date | HomeTeam | AwayTeam | FTHG | FTAG | awarded |
|---|---|---|---|---|---|---|---|---|
| 0 | Albania | Super League 2016/2017 | 2017-05-27 | Korabi Peshkopi | Flamurtari | 0 | 3 | False |
| 1 | Albania | Super League 2016/2017 | 2017-05-27 | Laci | Teuta | 2 | 1 | False |
| 2 | Albania | Super League 2016/2017 | 2017-05-27 | Luftetari Gjirokastra | Kukesi | 1 | 0 | False |
| 3 | Albania | Super League 2016/2017 | 2017-05-27 | Skenderbeu | Partizani | 2 | 2 | False |
| 4 | Albania | Super League 2016/2017 | 2017-05-27 | Vllaznia | KF Tirana | 0 | 0 | False |

Hopefully, that’s all relatively clear. You’ll notice that it’s very similar to the format used by football-data, which means that we can feed this dataframe into the get_home_team_advantage function. Sometimes, matches are awarded due to one team fielding an ineligible player or crowd trouble. We should probably exclude such matches from the home field advantage calculations.

# little bit of data cleansing to remove fixtures that were abandoned/awarded/postponed
all_league_goals = all_league_goals[~all_league_goals['awarded']]
all_league_goals = all_league_goals[all_league_goals['FTAG'] != 'POSTP.']
all_league_goals = all_league_goals[all_league_goals['FTAG'] != 'CAN.']
all_league_goals[['FTAG', 'FTHG']] = all_league_goals[['FTAG', 'FTHG']].astype(int)

We’re ready to put it all together. I’ll omit the code (though it can be found here), but we’ll loop through each country and league combination (just in case you decide to include multiple leagues from the same country) and calculate the home advantage score, plus its confidence limits as well as some other information for each league (number of teams, average number of goals in each match). I’ve converted the pandas output to a datatables table that you can interactively filter and sort.
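For a feel of what that loop involves, here is a rough, dependency-free sketch. It is not the omitted code: it swaps the Poisson model for the simpler log goals ratio shown earlier, so it produces no confidence limits, and its scores may drift from the model-based ones in unbalanced leagues. The function name summarise_leagues and the toy matches are made up for illustration.

```python
from collections import defaultdict
from math import log


def summarise_leagues(matches):
    """Summarise each (country, league) group of matches.

    matches: iterable of dicts with the keys country, league, HomeTeam,
    AwayTeam, FTHG and FTAG (the same fields as the dataframe above).
    """
    grouped = defaultdict(list)
    for match in matches:
        grouped[(match["country"], match["league"])].append(match)

    summary = []
    for (country, league), games in grouped.items():
        home_goals = sum(g["FTHG"] for g in games)
        away_goals = sum(g["FTAG"] for g in games)
        teams = {g["HomeTeam"] for g in games} | {g["AwayTeam"] for g in games}
        summary.append({
            "country": country,
            "league": league,
            "games": len(games),
            "teams": len(teams),
            "avg_goals": (home_goals + away_goals) / len(games),
            # shortcut estimate: log of total home goals over total away goals
            # (fails if a league has no home goals or no away goals at all)
            "home_adv_score": log(home_goals / away_goals),
        })
    # strongest home advantage first
    return sorted(summary, key=lambda row: row["home_adv_score"], reverse=True)


# made-up fixtures, purely to exercise the function
toy_matches = [
    {"country": "A", "league": "L1", "HomeTeam": "x", "AwayTeam": "y", "FTHG": 2, "FTAG": 1},
    {"country": "A", "league": "L1", "HomeTeam": "y", "AwayTeam": "x", "FTHG": 2, "FTAG": 1},
    {"country": "B", "league": "L2", "HomeTeam": "p", "AwayTeam": "q", "FTHG": 1, "FTAG": 1},
    {"country": "B", "league": "L2", "HomeTeam": "q", "AwayTeam": "p", "FTHG": 0, "FTAG": 0},
]
rows = summarise_leagues(toy_matches)
for row in rows:
    print(row)
```

Swapping the goals ratio line for a call to get_home_team_advantage on each league's matches would bring back the confidence limits, at the cost of fitting one model per league.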

|  | country | league | # games | # teams | avg_goals | home_adv score | left_tail | right_tail |
|---|---|---|---|---|---|---|---|---|
| 1 | Nigeria | Premier League 2017 | 379 | 20 | 2.011 | 1.195 | 1.027 | 1.363 |
| 2 | Haiti | Championnat National 2017 | 237 | 16 | 1.717 | 0.741 | 0.533 | 0.949 |
| 3 | Algeria | Ligue 1 2016/2017 | 238 | 16 | 2.092 | 0.698 | 0.512 | 0.884 |
| 4 | Ghana | Premier League 2017 | 238 | 16 | 2.202 | 0.676 | 0.494 | 0.857 |
| 5 | Bolivia | Liga de Futbol Prof 2016/2017 | 132 | 12 | 3.432 | 0.624 | 0.431 | 0.818 |
| 6 | Guatemala | Liga Nacional 2016/2017 | 264 | 12 | 2.155 | 0.620 | 0.448 | 0.792 |
| 7 | Benin | Championnat National 2017 | 162 | 19 | 1.778 | 0.571 | 0.330 | 0.811 |
| 8 | USA | MLS 2017 | 374 | 22 | 2.968 | 0.538 | 0.416 | 0.660 |
| 9 | Peru | Primera Division 2017 | 238 | 16 | 2.681 | 0.520 | 0.359 | 0.680 |
| 10 | Indonesia | Liga 1 2017 | 304 | 18 | 2.888 | 0.515 | 0.378 | 0.651 |
| 11 | Togo | Championnat National 2016/2017 | 181 | 14 | 1.934 | 0.510 | 0.293 | 0.726 |
| 12 | Uzbekistan | Professional Football League 2017 | 233 | 16 | 2.571 | 0.503 | 0.338 | 0.668 |
| 13 | Mozambique | Mocambola 2017 | 240 | 16 | 1.867 | 0.501 | 0.310 | 0.692 |
| 14 | Angola | Girabola 2017 | 239 | 16 | 2.151 | 0.499 | 0.321 | 0.678 |
| 15 | Greece | Super League 2016/2017 | 240 | 16 | 2.317 | 0.499 | 0.328 | 0.671 |
| 16 | Tunisia | Ligue Professionnelle 1 2016/2017 | 112 | 16 | 2.098 | 0.495 | 0.231 | 0.759 |
| 17 | Albania | Super League 2016/2017 | 180 | 10 | 1.889 | 0.488 | 0.269 | 0.707 |
| 18 | Sudan | Premier League 2017 | 306 | 18 | 2.261 | 0.486 | 0.332 | 0.639 |
| 19 | Tanzania | Ligi Kuu Bara 2016/2017 | 239 | 16 | 1.971 | 0.480 | 0.294 | 0.665 |
| 20 | Colombia | Liga Aguila 2017 | 400 | 20 | 2.145 | 0.465 | 0.328 | 0.603 |
| 21 | Ecuador | Serie A 2017 | 263 | 12 | 2.605 | 0.454 | 0.300 | 0.608 |
| 22 | Honduras | Liga Nacional 2016/2017 | 180 | 10 | 2.828 | 0.452 | 0.273 | 0.630 |
| 23 | Ethiopia | Premier League 2016/2017 | 239 | 16 | 1.837 | 0.433 | 0.241 | 0.625 |
| 24 | Morocco | Botola Pro 2016/2017 | 240 | 16 | 2.229 | 0.405 | 0.232 | 0.578 |
| 25 | India | I-League 2017 | 90 | 10 | 2.500 | 0.405 | 0.139 | 0.672 |
| 26 | Montenegro | Prva Crnogorska Liga 2016/2017 | 197 | 12 | 2.020 | 0.398 | 0.196 | 0.599 |
| 27 | Croatia | 1. HNL 2016/2017 | 180 | 10 | 2.417 | 0.396 | 0.204 | 0.588 |
| 28 | Zimbabwe | Premier Soccer League 2017 | 305 | 18 | 2.023 | 0.389 | 0.228 | 0.549 |
| 29 | Kosovo | Superliga 2016/2017 | 196 | 12 | 2.383 | 0.386 | 0.200 | 0.571 |
| 30 | Sierra Leone | Premier League 2014 | 90 | 14 | 1.833 | 0.376 | 0.060 | 0.691 |
| 31 | France | Ligue 1 2016/2017 | 379 | 20 | 2.615 | 0.375 | 0.248 | 0.502 |
| 32 | Malawi | Super League 2017 | 239 | 16 | 2.331 | 0.372 | 0.203 | 0.541 |
| 33 | Costa Rica | Primera Division 2016/2017 | 264 | 12 | 2.689 | 0.370 | 0.221 | 0.520 |
| 34 | Norway | Eliteserien 2017 | 240 | 16 | 2.842 | 0.368 | 0.215 | 0.520 |
| 35 | Bulgaria | Parva Liga 2016/2017 | 182 | 14 | 2.467 | 0.365 | 0.177 | 0.553 |
| 36 | Russia | Premier League 2016/2017 | 240 | 16 | 2.133 | 0.363 | 0.187 | 0.539 |
| 37 | Kazakhstan | Premier League 2017 | 198 | 12 | 2.465 | 0.361 | 0.180 | 0.542 |
| 38 | Belgium | Jupiler League 2016/2017 | 239 | 16 | 2.736 | 0.359 | 0.203 | 0.515 |
| 39 | FYR of Macedonia | First League 2016/2017 | 180 | 10 | 2.539 | 0.349 | 0.163 | 0.535 |
| 40 | Senegal | Ligue 1 2016/2017 | 181 | 14 | 2.204 | 0.348 | 0.149 | 0.547 |
| 41 | Azerbaijan | Premier League 2016/2017 | 111 | 8 | 2.234 | 0.346 | 0.094 | 0.599 |
| 42 | Moldova | Divizia Nationala 2016/2017 | 165 | 11 | 2.539 | 0.341 | 0.147 | 0.536 |
| 43 | Slovakia | Fortuna liga 2016/2017 | 184 | 12 | 2.690 | 0.339 | 0.159 | 0.518 |
| 44 | Cameroon | Elite One 2017 | 303 | 18 | 1.795 | 0.337 | 0.166 | 0.508 |
| 45 | Jamaica | Premier League 2016/2017 | 198 | 12 | 2.192 | 0.336 | 0.145 | 0.527 |
| 46 | Réunion | Regionale 1 2017 | 182 | 14 | 2.610 | 0.336 | 0.153 | 0.518 |
| 47 | Venezuela | Primera Division 2017 | 303 | 18 | 2.482 | 0.331 | 0.186 | 0.476 |
| 48 | Portugal | Primeira Liga 2016/2017 | 306 | 18 | 2.379 | 0.327 | 0.180 | 0.474 |
| 49 | South Africa | Premier League 2016/2017 | 240 | 16 | 2.242 | 0.322 | 0.151 | 0.494 |
| 50 | Germany | Bundesliga 2016/2017 | 306 | 18 | 2.866 | 0.315 | 0.181 | 0.449 |
| 51 | Uganda | Premier League 2016/2017 | 237 | 16 | 2.135 | 0.313 | 0.137 | 0.490 |
| 52 | Guinea | Ligue 1 2016/2017 | 181 | 14 | 2.044 | 0.312 | 0.105 | 0.519 |
| 53 | Thailand | Thai Premier League 2017 | 306 | 18 | 3.389 | 0.309 | 0.186 | 0.432 |
| 54 | Yemen | Division 1 2013/2014 | 180 | 14 | 2.322 | 0.308 | 0.114 | 0.503 |
| 55 | Zambia | Super League 2017 | 379 | 20 | 2.003 | 0.308 | 0.164 | 0.452 |
| 56 | Kyrgyzstan | Top Liga 2017 | 60 | 6 | 2.950 | 0.307 | 0.009 | 0.606 |
| 57 | Hungary | OTP Bank Liga 2016/2017 | 198 | 12 | 2.631 | 0.307 | 0.133 | 0.481 |
| 58 | Namibia | MTC Premiership 2015/2016 | 240 | 16 | 2.412 | 0.304 | 0.139 | 0.469 |
| 59 | China | Super League 2017 | 240 | 16 | 3.050 | 0.303 | 0.156 | 0.449 |
| 60 | Niger | Ligue 1 2016/2017 | 181 | 14 | 2.171 | 0.301 | 0.101 | 0.502 |
| 61 | Iraq | Super League 2016/2017 | 354 | 20 | 2.110 | 0.299 | 0.153 | 0.446 |
| 62 | Netherlands | Eredivisie 2016/2017 | 306 | 18 | 2.889 | 0.296 | 0.163 | 0.430 |
| 63 | Serbia | Super Liga 2016/2017 | 239 | 16 | 2.364 | 0.290 | 0.124 | 0.457 |
| 64 | Palestine | West Bank League 2016/2017 | 131 | 12 | 2.450 | 0.287 | 0.066 | 0.508 |
| 65 | England | Premier League 2016/2017 | 380 | 20 | 2.800 | 0.284 | 0.162 | 0.405 |
| 66 | Gabon | Championnat D1 2016/2017 | 163 | 14 | 2.307 | 0.282 | 0.077 | 0.486 |
| 67 | Brazil | Serie A 2017 | 380 | 20 | 2.429 | 0.281 | 0.151 | 0.412 |
| 68 | Turkmenistan | Yokary Liga 2017 | 143 | 9 | 2.916 | 0.278 | 0.084 | 0.472 |
| 69 | Spain | LaLiga 2016/2017 | 380 | 20 | 2.942 | 0.263 | 0.144 | 0.381 |
| 70 | Poland | Ekstraklasa 2016/2017 | 240 | 16 | 2.767 | 0.260 | 0.107 | 0.414 |
| 71 | Czech Republic | 1. Liga 2016/2017 | 240 | 16 | 2.488 | 0.259 | 0.098 | 0.421 |
| 72 | Italy | Serie A 2016/2017 | 379 | 20 | 2.955 | 0.257 | 0.139 | 0.375 |
| 73 | Wales | Premier League 2016/2017 | 132 | 12 | 2.970 | 0.246 | 0.047 | 0.446 |
| 74 | New Zealand | Football Championship 2016/2017 | 90 | 10 | 3.567 | 0.244 | 0.024 | 0.465 |
| 75 | Republic of the Congo | Ligue 1 2017 | 300 | 18 | 2.237 | 0.244 | 0.091 | 0.397 |
| 76 | Kenya | Premier League 2017 | 304 | 18 | 2.026 | 0.244 | 0.084 | 0.403 |
| 77 | Ukraine | Pari-Match League 2016/2017 | 132 | 12 | 2.462 | 0.241 | 0.022 | 0.460 |
| 78 | Austria | Tipico Bundesliga 2016/2017 | 180 | 10 | 2.711 | 0.239 | 0.060 | 0.418 |
| 79 | Switzerland | Super League 2016/2017 | 180 | 10 | 3.233 | 0.235 | 0.071 | 0.398 |
| 80 | Mexico | Primera Division 2016/2017 | 306 | 18 | 2.634 | 0.234 | 0.095 | 0.373 |
| 81 | Turkey | Super Lig 2016/2017 | 305 | 18 | 2.708 | 0.231 | 0.094 | 0.369 |
| 82 | Bosnia and Herzegovina | Premier League 2016/2017 | 132 | 12 | 2.242 | 0.231 | 0.001 | 0.460 |
| 83 | Romania | Liga 1 2016/2017 | 181 | 14 | 2.376 | 0.223 | 0.033 | 0.413 |
| 84 | Philippines | PFL 2017 | 109 | 8 | 3.202 | 0.218 | 0.007 | 0.430 |
| 85 | Malaysia | Super League 2017 | 132 | 12 | 3.091 | 0.217 | 0.021 | 0.412 |
| 86 | Australia | A-League 2016/2017 | 135 | 10 | 3.030 | 0.213 | 0.018 | 0.409 |
| 87 | DR Congo | Super Ligue 2016/2017 | 195 | 26 | 2.205 | 0.198 | 0.006 | 0.391 |
| 88 | Syria | Premier League 2016/2017 | 239 | 16 | 2.180 | 0.197 | 0.025 | 0.370 |
| 89 | Argentina | Primera Division 2016/2017 | 450 | 30 | 2.276 | 0.195 | 0.071 | 0.318 |
| 90 | Burundi | Ligue A 2016/2017 | 239 | 16 | 2.138 | 0.192 | 0.018 | 0.366 |
| 91 | Cyprus | First Division 2016/2017 | 182 | 14 | 2.879 | 0.191 | 0.019 | 0.363 |
| 92 | Sweden | Allsvenskan 2017 | 240 | 16 | 2.779 | 0.189 | 0.037 | 0.342 |
| 93 | Tajikistan | Vysshaya Liga 2017 | 84 | 8 | 2.702 | 0.189 | -0.077 | 0.456 |
| 94 | Northern Ireland | NIFL Premiership 2016/2017 | 195 | 12 | 2.933 | 0.188 | 0.023 | 0.353 |
| 95 | Scotland | Premiership 2016/2017 | 198 | 12 | 2.687 | 0.187 | 0.016 | 0.358 |
| 96 | Saudi Arabia | Saudi Professional League 2016/2017 | 182 | 14 | 3.016 | 0.186 | 0.018 | 0.354 |
| 97 | Iceland | Pepsideild 2017 | 132 | 12 | 3.053 | 0.184 | -0.012 | 0.380 |
| 98 | Nicaragua | Primera Division 2016/2017 | 179 | 10 | 3.156 | 0.182 | 0.016 | 0.347 |
| 99 | Denmark | Superliga 2016/2017 | 182 | 14 | 2.632 | 0.180 | 0.000 | 0.360 |
| 100 | Lesotho | Premier League 2016/2017 | 180 | 14 | 2.428 | 0.180 | -0.009 | 0.368 |
| 101 | Vietnam | V-League 2017 | 182 | 14 | 2.912 | 0.174 | 0.003 | 0.345 |
| 102 | Rwanda | National Football league 2016/2017 | 239 | 16 | 2.134 | 0.174 | -0.001 | 0.348 |
| 103 | Ireland | Premier Division 2017 | 198 | 12 | 2.773 | 0.172 | 0.004 | 0.341 |
| 104 | Estonia | Meistriliiga 2017 | 180 | 10 | 3.656 | 0.171 | 0.017 | 0.324 |
| 105 | United Arab Emirates | UAE League 2016/2017 | 182 | 14 | 3.137 | 0.165 | 0.000 | 0.330 |
| 106 | El Salvador | Primera Division 2016/2017 | 263 | 12 | 2.601 | 0.161 | 0.010 | 0.311 |
| 107 | Luxembourg | National Division 2016/2017 | 182 | 14 | 3.319 | 0.153 | -0.007 | 0.313 |
| 108 | Bangladesh | Premier League 2016 | 132 | 12 | 2.591 | 0.152 | -0.060 | 0.365 |
| 109 | Mauritania | Championnat D1 2016/2017 | 181 | 14 | 2.453 | 0.149 | -0.038 | 0.336 |
| 110 | Swaziland | MTN Premier League 2016/2017 | 131 | 12 | 2.634 | 0.146 | -0.065 | 0.358 |
| 111 | Trinidad and Tobago | Pro League 2017 | 90 | 10 | 2.922 | 0.145 | -0.098 | 0.387 |
| 112 | Malta | Premier League 2016/2017 | 197 | 12 | 2.878 | 0.145 | -0.021 | 0.310 |
| 113 | Chile | Primera Division 2016/2017 | 120 | 16 | 2.892 | 0.145 | -0.068 | 0.357 |
| 114 | Israel | Ligat ha'Al 2016/2017 | 182 | 14 | 2.132 | 0.145 | -0.055 | 0.344 |
| 115 | Botswana | Premier League 2016/2017 | 240 | 16 | 2.317 | 0.144 | -0.023 | 0.311 |
| 116 | Oman | Professional League 2016/2017 | 182 | 14 | 2.758 | 0.136 | -0.040 | 0.311 |
| 117 | Iran | Persian Gulf Pro League 2016/2017 | 240 | 16 | 2.100 | 0.135 | -0.040 | 0.310 |
| 118 | Bermuda | Premier League 2016/2017 | 90 | 10 | 3.144 | 0.134 | -0.099 | 0.368 |
| 119 | Lithuania | A Lyga 2017 | 112 | 8 | 2.580 | 0.132 | -0.099 | 0.363 |
| 120 | Egypt | Premier League 2016/2017 | 305 | 18 | 2.256 | 0.122 | -0.027 | 0.272 |
| 121 | Faroe Islands | Premier League 2017 | 134 | 10 | 3.187 | 0.122 | -0.068 | 0.312 |
| 122 | Burkina Faso | Premier League 2016/2017 | 240 | 16 | 1.721 | 0.121 | -0.072 | 0.314 |
| 123 | Finland | Veikkausliiga 2017 | 198 | 12 | 2.737 | 0.120 | -0.049 | 0.289 |
| 124 | Seychelles | Division One 2017 | 132 | 12 | 3.303 | 0.114 | -0.075 | 0.302 |
| 125 | Japan | J-League 2017 | 306 | 18 | 2.592 | 0.114 | -0.026 | 0.253 |
| 126 | Myanmar | National League 2017 | 128 | 12 | 2.594 | 0.113 | -0.104 | 0.329 |
| 127 | Ivory Coast | Ligue 1 2016/2017 | 182 | 14 | 1.802 | 0.110 | -0.107 | 0.327 |
| 128 | Georgia | Erovnuli Liga 2017 | 179 | 10 | 2.810 | 0.108 | -0.067 | 0.283 |
| 129 | Qatar | Premier League 2016/2017 | 182 | 14 | 3.132 | 0.105 | -0.059 | 0.270 |
| 130 | South Korea | K-League Classic 2017 | 198 | 12 | 2.737 | 0.102 | -0.067 | 0.271 |
| 131 | Slovenia | Prva liga 2016/2017 | 180 | 10 | 2.572 | 0.099 | -0.083 | 0.282 |
| 132 | Lebanon | Premier League 2016/2017 | 131 | 12 | 2.771 | 0.094 | -0.113 | 0.300 |
| 133 | San Marino | Campionato Sammarinese 2016/2017 | 154 | 15 | 3.143 | 0.085 | -0.095 | 0.264 |
| 134 | Belarus | Vysshaya Liga 2017 | 240 | 16 | 2.333 | 0.071 | -0.094 | 0.237 |
| 135 | Mali | Premiere Division 2016 | 162 | 19 | 2.031 | 0.064 | -0.153 | 0.281 |
| 136 | Gibraltar | Premier Division 2016/2017 | 132 | 10 | 3.288 | 0.064 | -0.126 | 0.253 |
| 137 | Hong Kong | Premier League 2016/2017 | 110 | 11 | 3.427 | 0.058 | -0.144 | 0.260 |
| 138 | Singapore | S.League 2017 | 108 | 9 | 2.981 | 0.056 | -0.163 | 0.275 |
| 139 | Sri Lanka | Champions League 2017 | 143 | 18 | 3.266 | 0.051 | -0.133 | 0.235 |
| 140 | Cape Verde | Campeonato Nacional 2017 | 36 | 12 | 2.389 | 0.047 | -0.376 | 0.469 |
| 141 | Djibouti | Division 1 2016/2017 | 90 | 10 | 3.978 | 0.045 | -0.163 | 0.252 |
| 142 | Uruguay | Primera Division 2017 | 240 | 16 | 2.729 | 0.034 | -0.120 | 0.187 |
| 143 | Gambia | GFA League 2016/2017 | 131 | 12 | 1.908 | 0.032 | -0.218 | 0.281 |
| 144 | Canada | CSL 2017 | 56 | 8 | 4.304 | 0.025 | -0.228 | 0.277 |
| 145 | Armenia | Premier League 2016/2017 | 90 | 6 | 2.200 | 0.020 | -0.258 | 0.299 |
| 146 | Panama | LPF 2016/2017 | 180 | 10 | 2.083 | 0.005 | -0.197 | 0.208 |
| 147 | Kuwait | Premier League 2016/2017 | 210 | 15 | 3.048 | -0.000 | -0.155 | 0.155 |
| 148 | Mauritius | Mauritian League 2016/2017 | 179 | 10 | 2.922 | -0.001 | -0.173 | 0.170 |
| 149 | Andorra | Primera Divisió 2016/2017 | 83 | 8 | 3.265 | -0.005 | -0.245 | 0.236 |
| 150 | Latvia | SynotTip Virslīga 2017 | 96 | 8 | 2.417 | -0.006 | -0.264 | 0.251 |
| 151 | Libya | Premier League 2017 | 83 | 28 | 2.265 | -0.015 | -0.324 | 0.293 |
| 152 | Dominican Republic | LDF 2017 | 90 | 10 | 2.567 | -0.021 | -0.280 | 0.239 |
| 153 | Cambodia | C-League 2017 | 132 | 12 | 3.864 | -0.031 | -0.205 | 0.142 |
| 154 | Paraguay | Primera Division 2017 | 264 | 12 | 2.534 | -0.033 | -0.184 | 0.119 |
| 155 | Jordan | Premier League 2016/2017 | 132 | 12 | 2.235 | -0.034 | -0.262 | 0.194 |
| 156 | Bahrain | Premier League 2016/2017 | 90 | 10 | 2.556 | -0.048 | -0.310 | 0.213 |
| 157 | Pakistan | Premier League 2014/2015 | 132 | 12 | 2.333 | -0.065 | -0.288 | 0.159 |
| 158 | Liberia | LFA First Division 2016/2017 | 125 | 12 | 2.248 | -0.089 | -0.323 | 0.145 |
| 159 | Somalia | Nation Link Telecom Championship 2016/2017 | 90 | 10 | 2.922 | -0.153 | -0.396 | 0.090 |
| 160 | Maldives | Dhivehi Premier League 2017 | 56 | 8 | 3.304 | -0.370 | -0.782 | 0.042 |

Focusing on the home_adv score column, teams in Nigeria by far enjoy the greatest benefit from playing at home (score = 1.195). In other words, home teams scored 3.3 ($= e^{1.195}$) times more goals than their opponents. This isn’t new information and can be attributed to a combination of corruption (e.g. bribing referees) and violent fans. In fact, my motivation for this post was to identify more football corruption hotspots. Alas, when it comes to home turf invincibility, it seems Nigeria are the World Cup winners.

Fourteen leagues have a negative home_advantage_score (counting Kuwait’s -0.000), meaning that visiting teams actually scored more goals than their hosts, though none was statistically significant. By some distance, the Maldives records the most negative score. Luckily, I’ve twice researched this beautiful archipelago and I’m aware that all matches in the Dhivehi Premier League are played at the national stadium in Malé (much like the Gibraltar Premier League). So it would make sense that there’s no particular advantage gained by the home team. Libya is another interesting example. Owing to security issues, all matches in the Libyan Premier League are played in neutral venues with no spectators present. Quite fittingly, it returned a home advantage score just off zero. Generally speaking, the leagues with near zero home advantage come from small countries (minimal inconvenience for travelling teams) with a small number of teams and they tend to share stadiums.
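The statistically significant call above boils down to whether a league’s confidence interval excludes zero. A minimal sketch, using a hypothetical helper name and a handful of limits copied from the table:

```python
def home_advantage_verdict(left_tail, right_tail):
    """Classify a league from the limits of its confidence interval."""
    if left_tail > 0:
        return "home advantage"
    if right_tail < 0:
        return "away advantage"
    return "not significant"


# (league, left_tail, right_tail) copied from the table above
rows = [
    ("Nigeria", 1.027, 1.363),
    ("England", 0.162, 0.405),
    ("Libya", -0.324, 0.293),
    ("Maldives", -0.782, 0.042),
]
for name, left, right in rows:
    print(name, home_advantage_verdict(left, right))
```

Even the Maldives, with by far the most negative score, has an interval that creeps just above zero, so it can’t be called a genuine away advantage at the 5% level.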

If you sort the avg_goals column, you’ll see the semi-pro Canadian Soccer League is the place to be for goals (average = 4.304). But rather than sifting through that table or explaining the results with words, the most intuitive way to illustrate this type of data is with a map of the world. This might also help to clarify whether there’s any geographical influence on the home advantage effect. Again, I won’t go into the details (an appendix can be found in the Jupyter notebook), but I built a map using the JavaScript library, D3. And by built I mean I adapted the code from this post and this post. Though a little outdated now, I found this post quite useful too. Finally, I think this post shows off quite well what you can do with maps using D3.

And here it is! The country colour represents its home_advantage_score . You can zoom in and out and hover over a country to reveal a nice informative overlay; use the radio buttons to switch between home advantage and goals scored. I recommend viewing it on desktop (mobile’s a bit jumpy) and on Chrome (sometimes have security issues with Firefox).

It’s not scientifically rigorous (not in academia any more, baby!), but there’s evidence for some geographical trends. For example, it appears that home advantage is stronger in Africa and South America compared to Western and Central Europe, with the unstable warzones of Libya, Somalia and Paraguay (?) being notable exceptions. As for average goals, Europe boasts stronger colours compared to Africa, though South East Asia seems to be the global hotspot for goals. North America is also quite dark, but you can debate whether Canada should be coloured grey, as the best Canadian teams belong to the American soccer system.

Conclusion

Using a previously described model and some JavaScript, this post explored the so-called home advantage in football leagues all over the world (including Réunion). I don’t think it uncovered anything particularly amazing: different leagues have different properties and don’t bet on the away team in the Nigerian league. You can play around with the Python code here. Thanks for reading!