At a machine learning meetup, we were talking about applying probability to sports. The issue raised was that football has very few goals per match, which makes prediction hard. Sports like handball or basketball, on the other hand, have many more scoring events, so the errors tend to cancel each other out. I had already read Nate Silver’s book The Signal and the Noise, but I wanted to build the simplest possible prediction model. Watching my girlfriend play handball, I thought I could predict the final score, and give a confidence interval for it, using only the information available at minute 20. I tried a binomial model, waited until the end of the game, and it turned out to be quite good. Here I apply the model to two Spurs matches, for two reasons: I don’t want to cheat by picking favorable data, and the data from my girlfriend’s match is not available on the internet.

The model

An NBA basketball game has four quarters of 12 minutes each. Given the information from the first quarter, I want to predict the final result. For instance, suppose team 1 scored x1 points in the first quarter. Then the probability of scoring a point in a given second is p1 = x1/720, because 720 = 12*60. This treats every second as an independent trial that yields either one point or none.

A complete game has 2880 seconds; therefore, the final score follows a binomial distribution with parameters p = p1 and n = 2880.
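The model above can be sketched in a few lines of Python. This is only an illustration using the standard library, not the code I originally ran: the function names (`binom_pmf`, `final_score_pmf`, `central_interval`) are made up for this example, and the 29 first-quarter points are taken from the first game discussed below. The probabilities are evaluated in log space because the binomial coefficients for n = 2880 are too large for ordinary floats.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), evaluated in log space to avoid overflow."""
    log_prob = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_prob)

def final_score_pmf(points_so_far, seconds_so_far=720, game_seconds=2880):
    """Distribution of the final score; assumes 0 < points_so_far < seconds_so_far."""
    p = points_so_far / seconds_so_far
    return [binom_pmf(k, game_seconds, p) for k in range(game_seconds + 1)]

def central_interval(pmf, level=0.95):
    """Scores at which the cumulative probability first reaches each tail bound."""
    tail = (1 - level) / 2
    cumulative, low = 0.0, None
    for k, prob in enumerate(pmf):
        cumulative += prob
        if low is None and cumulative >= tail:
            low = k
        if cumulative >= 1 - tail:
            return low, k
    return low, len(pmf) - 1

# Example: a team that scored 29 points in the first quarter.
pmf = final_score_pmf(29)
expected = sum(k * prob for k, prob in enumerate(pmf))
low, high = central_interval(pmf)
print(f"expected final score: {expected:.1f}, 95% interval: [{low}, {high}]")
```

The expected value is simply n times p, that is, four times the first-quarter score; the interval is what the binomial spread adds on top of that.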

I will show two examples here, using the Spurs’ last two games.

The first game finished with 121 points for the Spurs and 92 for the Wolves. In the first quarter, the score was 29 to 26. Using this data, I computed the distribution of final results.

As can be seen, the Spurs were the clear favorites to win the game: their probability of winning was 78.3%. The expected result was 116 to 104, and the actual result fell inside the 95% interval for both teams.
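A winning probability like the one above can be obtained by comparing the two binomial distributions. The sketch below assumes that the two teams' scores are independent and that "winning" means strictly outscoring the opponent, with ties reported separately; both choices are assumptions of this example, and the helper names are hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), evaluated in log space to avoid overflow."""
    log_prob = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_prob)

def win_and_tie_probability(points_a, points_b, seconds_so_far=720, game_seconds=2880):
    """P(A > B) and P(A = B) for two independent binomial final scores."""
    p_a = points_a / seconds_so_far
    p_b = points_b / seconds_so_far
    p_win = p_tie = 0.0
    b_below = 0.0  # running value of P(B < k)
    for k in range(game_seconds + 1):
        prob_a = binom_pmf(k, game_seconds, p_a)
        prob_b = binom_pmf(k, game_seconds, p_b)
        p_win += prob_a * b_below   # A scores k and B scores fewer than k
        p_tie += prob_a * prob_b    # both score exactly k
        b_below += prob_b
    return p_win, p_tie

# First-quarter score of the first game: Spurs 29, Wolves 26.
p_win, p_tie = win_and_tie_probability(29, 26)
print(f"P(Spurs score more) = {p_win:.3f}, P(tie) = {p_tie:.3f}")
```

Since a basketball game cannot end in a tie, how the tie probability is split between the teams changes the result by a point or two, so small differences from the figure quoted above are to be expected.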

The second game was more interesting. The final result was 87 for the Nets and 99 for the Spurs, while the first quarter ended 18 to 25, respectively.

The probability of the Nets winning this game was as low as 1.7%.

The expected result was 100 for the Spurs, which is almost the actual result, suggesting that they played the whole match the way they played the first quarter. In contrast, the expected result for the Nets was 72, much lower than their actual score. Moreover, the probability of the Nets scoring at least as many points as they actually did is only 4.6%. This suggests that something changed in their game after the first quarter.
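The "at least as big as the actual result" probability is an upper-tail sum of the binomial distribution. Here is a minimal sketch, again with hypothetical helper names, using the Nets' 18 first-quarter points and their final score of 87.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), evaluated in log space to avoid overflow."""
    log_prob = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_prob)

def upper_tail(score, points_so_far, seconds_so_far=720, game_seconds=2880):
    """P(X >= score) under the binomial model fitted to the first quarter."""
    p = points_so_far / seconds_so_far
    return sum(binom_pmf(k, game_seconds, p) for k in range(score, game_seconds + 1))

# Nets: 18 points in the first quarter, 87 at the final whistle.
nets_tail = upper_tail(87, 18)
print(f"P(Nets final score >= 87) = {nets_tail:.3f}")
```

A small tail probability like this one says that the final score is hard to explain with the first-quarter scoring rate alone.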

In fact, the Nets’ scores by quarter were 18, 18, 28 and 23. Clearly, something changed between the second and third quarters, and I want to know what.

Handball game

In handball, the number of goals per match is lower than in basketball, so I found that using minutes instead of seconds as the unit of time gives good results. After the first third of the match (minute 20 of 60), the score was 10 to 6 for the home and away teams, respectively. With this data, we compute the binomial distribution.
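The same model with minutes as trials can be sketched as follows. With n = 60 the binomial coefficients are small enough to compute exactly with `math.comb`. The variable names are illustrative, and the away team's winning probability assumes that both scores are independent. Note that this version of the model cannot produce more than one goal per team per minute.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); n = 60 is small enough for exact integers."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

MINUTES_PLAYED, MATCH_MINUTES = 20, 60
HOME_GOALS, AWAY_GOALS = 10, 6  # score at minute 20

scores = range(MATCH_MINUTES + 1)
home_pmf = [binom_pmf(k, MATCH_MINUTES, HOME_GOALS / MINUTES_PLAYED) for k in scores]
away_pmf = [binom_pmf(k, MATCH_MINUTES, AWAY_GOALS / MINUTES_PLAYED) for k in scores]

expected_home = sum(k * prob for k, prob in enumerate(home_pmf))
expected_away = sum(k * prob for k, prob in enumerate(away_pmf))

# P(away > home): away scores a goals and home scores fewer than a.
p_away_wins = sum(away_pmf[a] * sum(home_pmf[:a]) for a in scores)

print(f"expected: {expected_home:.0f} to {expected_away:.0f}")
print(f"P(away team wins) = {p_away_wins:.4f}")
```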

As you can see, the predicted final result, given the information from the first third, was quite polarized. The probability of the away team winning was less than 1%. The expected result was 30 to 18, while the actual result was 30 to 21. The prediction for the home team was perfect, and the probability of the away team reaching their actual score (or more) was 25%. This means the away team did a good job of improving their chances. I was happy with this model, and also happy to watch Melisa scoring goals and shifting the bell curve to the right.