In soccer, the relationship between goals and points is a curious one. Naturally, to win a match you must outscore your opponent, yet goals do not translate directly into points. Today let’s take a look at the relationship between goals scored, goals conceded, and final points, and at how an individual’s performance affects the point total.

What is the Pythagorean Expectation formula?

The famous baseball theorist Bill James discovered an interesting relationship between runs scored, runs conceded, and winning percentage in baseball. His original formula can be described as the following:

$$\text{Win\%} = \frac{s^x}{s^x + a^x}$$

Where Win% is the winning percentage of the team, s is the number of runs scored, a is the number of runs allowed, and x is an exponent. Originally, x was 2, hence the name Pythagorean Expectation. Plug in the number of runs scored and the number of runs allowed, and you get the expected winning percentage of the team. To get the number of games the team can be expected to win, simply multiply the winning percentage by the number of games in a single season (162 in Major League Baseball).
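As a rough illustration of the calculation just described, here is a minimal Python sketch. The function names and the 800/700 run totals are made up for the example; only the formula itself and the 162-game season come from the text above.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored (s) and runs allowed (a)."""
    s = runs_scored ** exponent
    a = runs_allowed ** exponent
    return s / (s + a)


def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins over a season of `games` games (162 in MLB)."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical team: 800 runs scored, 700 runs allowed.
print(f"Win%: {pythagorean_win_pct(800, 700):.3f}")
print(f"Expected wins: {expected_wins(800, 700):.1f}")
```

The exponent is left as a parameter so that other values can be tried without touching the function body.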

Over time, the Pythagorean expectation theory has gained mainstream acceptance in the baseball statistics community. The formula has been repeatedly refined, with the exponent adjusted to reflect changes in the league’s scoring environment, that is, the number of runs a team is expected to score per game on average. The exponent varies from season to season because of rule changes, technology, stadiums, and other factors that affect the average number of runs scored per game in Major League Baseball. After accounting for these factors, the exponent currently used is 1.83.

The Pythagorean Expectation is the percentage of games that a baseball team is expected to win based on the number of runs scored and the number of runs conceded. There is a margin of error, often very small, as the long Major League Baseball season tends to smooth out statistical noise. Usually the margin of error is attributed to the effects of luck, or to the effects of bench players. Since bench players have very little playing time, it is difficult to gauge their true talent level.

The Pythagorean expectation formula was a breakthrough in baseball research, and over time it has been adapted to most sports with either no draws or very few draws. For instance, the formula works quite well in basketball too, but instead of an exponent of 1.83, the exponent is 14, to better represent the significantly higher number of points scored in a basketball game.

Applying Pythagorean expectations to soccer

Soccer is a sport that presents 3 potential results. A team can win, lose, or draw. The final table is not determined by winning percentage but by points, where teams get 3 points for a win, 1 point for a draw, and zero points for a loss. The Pythagorean model thus cannot simply be carried over to soccer; changes have to be made to account for 3 possible outcomes instead of 2.

Over at fivethirtyeight.com, Carl Bialik has released research on the application of Pythagorean Expectations in soccer. According to Bialik, the best scoring margin does not assure a championship, but the ratio of goals scored to goals conceded allows us to predict the total points that a team would have. Over the long term, the Pythagorean Expectations formula can predict the number of points a team earns from its goals scored and goals conceded to within a margin of error of 4 points. Let P be the number of points the team is expected to take, G the number of goals scored, C the number of goals conceded, and T the total number of possible points. Bialik’s formula is the following:

$$P = T \cdot \frac{G^{1.18}}{G^{1.18} + C^{1.23}}$$

The exponents 1.18 and 1.23 were calculated from the number of goals scored and goals conceded per season, and they will change depending on the average number of goals scored per game. T represents the total number of possible points per season, which is the total number of games multiplied by the number of points awarded for a win. In the Premier League, this number is 114, since each team plays 38 games and 3 points are awarded for each win.
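To make the calculation concrete, here is a small Python sketch. It assumes the formula takes the form P = T * G^1.18 / (G^1.18 + C^1.23), with 1.18 applied to goals scored and 1.23 to goals conceded. The function name, the parameter names, and the 60/40 stat line are invented for illustration.

```python
def expected_points(goals_scored, goals_conceded, total_points=114,
                    exp_scored=1.18, exp_conceded=1.23):
    """Pythagorean expected points (assumed form: T * G^a / (G^a + C^b))."""
    scored = goals_scored ** exp_scored
    conceded = goals_conceded ** exp_conceded
    return total_points * scored / (scored + conceded)


# T for the Premier League: 38 games, 3 points per win.
games, points_per_win = 38, 3
T = games * points_per_win

# Hypothetical team: 60 goals scored, 40 conceded.
print(f"T = {T}")
print(f"Expected points: {expected_points(60, 40, T):.1f}")
```

The exponents are keyword arguments because, as noted above, they depend on the league’s scoring environment and may need refitting.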

Now we have to talk about what the Pythagorean Expectation actually implies. The Pythagorean Expectation is simply an expectation: an algorithm that predicts the number of points a team is expected to earn over the course of a season from the number of goals the team scores and concedes. Over the long term (multiple seasons), the Pythagorean Expectation is very accurate at predicting the number of points that a team can expect to earn. However, the sample size in a single season is not large enough for it to accurately predict a team’s finish, because luck plays too big a role within a single season. Over the course of several seasons, however, it does allow us to see important trends and make predictions.

The original Pythagorean Expectations formula led to a breakthrough in baseball research. It led to the discovery that, over a full season, a player who contributes 10 more runs (either offensively or defensively) adds about 1 more win for the team. We can reach a similar conclusion in soccer, but it is slightly more difficult.

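First let’s consider offensive contributions. How many points is one more goal scored worth? Let G be the number of goals scored, and C be the number of goals conceded. The following equation gives us the marginal value of one additional goal, expressed as a change in points:

$$\Delta P = T \left( \frac{(G+1)^{1.18}}{(G+1)^{1.18} + C^{1.23}} - \frac{G^{1.18}}{G^{1.18} + C^{1.23}} \right)$$

How many extra points would a team get if it scored a number of extra goals? Again let G be the goals scored, C the goals conceded, and x the number of additional goals. The number of points the team can be expected to win is:

$$P_{+x} = T \cdot \frac{(G+x)^{1.18}}{(G+x)^{1.18} + C^{1.23}}$$

The value of those additional goals scored, expressed in points, is:

$$\Delta P_{+x} = T \left( \frac{(G+x)^{1.18}}{(G+x)^{1.18} + C^{1.23}} - \frac{G^{1.18}}{G^{1.18} + C^{1.23}} \right)$$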

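Defensively, the math is very similar. For each goal that a player prevents, we simply subtract 1 from the number of goals conceded. Using the same variables as above, the value of preventing a goal becomes this:

$$\Delta P = T \left( \frac{G^{1.18}}{G^{1.18} + (C-1)^{1.23}} - \frac{G^{1.18}}{G^{1.18} + C^{1.23}} \right)$$

And the number of points a team can be expected to win after preventing an additional x goals is:

$$P_{-x} = T \cdot \frac{G^{1.18}}{G^{1.18} + (C-x)^{1.23}}$$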
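The offensive and defensive calculations above can be sketched in Python as follows. This assumes the points formula has the form P = T * G^1.18 / (G^1.18 + C^1.23) with T = 114. All function names and the 60/40 stat line are hypothetical.

```python
def expected_points(goals_scored, goals_conceded, total_points=114):
    """Assumed form of the soccer formula: T * G^1.18 / (G^1.18 + C^1.23)."""
    scored = goals_scored ** 1.18
    conceded = goals_conceded ** 1.23
    return total_points * scored / (scored + conceded)


def value_of_goals_scored(goals_scored, goals_conceded, extra=1):
    """Change in expected points from scoring `extra` additional goals."""
    before = expected_points(goals_scored, goals_conceded)
    after = expected_points(goals_scored + extra, goals_conceded)
    return after - before


def value_of_goals_prevented(goals_scored, goals_conceded, prevented=1):
    """Change in expected points from conceding `prevented` fewer goals."""
    if prevented > goals_conceded:
        raise ValueError("cannot prevent more goals than were conceded")
    before = expected_points(goals_scored, goals_conceded)
    after = expected_points(goals_scored, goals_conceded - prevented)
    return after - before


# Hypothetical team: 60 goals scored, 40 conceded.
print(f"One extra goal scored:  {value_of_goals_scored(60, 40):+.2f} points")
print(f"One goal prevented:     {value_of_goals_prevented(60, 40):+.2f} points")
```

The result depends on the particular G and C, so it is worth running for the team in question rather than assuming one fixed exchange rate between goals and points.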

What does it mean?

So first of all, let me reiterate what the Pythagorean Expectation means. The Pythagorean Expectation is the number of points a team is expected to earn given its goals scored and goals conceded. Over the course of a single season, a team’s actual record might deviate from its expected record due to factors such as luck, officiating decisions, and blowouts. However, over the course of several seasons, the Pythagorean Expectation is surprisingly accurate in predicting results, as factors such as luck tend to even out.

Using Pythagorean Expectations, we can reach two interesting conclusions:

1. An extra goal scored is not as valuable as preventing a goal from being scored. Fans love to mock teams that park the bus, but in reality, parking the bus is a remarkably difficult skill. For a team to score, it usually has to take more risks, which then gives its opponents an opportunity to score. One of the most important skills for a team as a whole is to defend and to keep a lead. Scoring alone does not ensure the 3 points; being able to prevent goals while scoring is the most important skill, and thus preventing goals is valued a bit more.

2. The goal differential is not a very good measure of a team’s relative skill; a much better measure is the ratio between goals scored and goals conceded. A +10 goal differential looks good, but 110-100 is much less impressive than 59-50. The first team can expect to finish with about 54 points, while the second can expect a better result of about 57 points. The ratio is significantly more important than the difference.
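The second point can be checked with a short Python sketch that puts two stat lines with similar goal differences side by side. It assumes the points formula has the form P = T * G^1.18 / (G^1.18 + C^1.23) with T = 114; the function name is illustrative.

```python
def expected_points(goals_scored, goals_conceded, total_points=114):
    """Assumed form of the soccer formula: T * G^1.18 / (G^1.18 + C^1.23)."""
    scored = goals_scored ** 1.18
    conceded = goals_conceded ** 1.23
    return total_points * scored / (scored + conceded)


# The two stat lines compared in the text: (goals scored, goals conceded).
stat_lines = [(110, 100), (59, 50)]

for scored, conceded in stat_lines:
    difference = scored - conceded
    ratio = scored / conceded
    points = expected_points(scored, conceded)
    print(f"{scored}-{conceded}: difference {difference:+d}, "
          f"ratio {ratio:.2f}, expected points {points:.1f}")
```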

Applying the theory

Now let’s look at how Pythagorean expectations can be applied in discussing hypothetical future scenarios and team construction. Consider the following example:

Manchester United came in 7th this season with 64 points, 64 goals scored, and 43 goals allowed. Plugging the numbers into the Pythagorean Expectations equation, we get an expected 64.96 points, a deviation of less than 1 point from what actually happened. What would happen if they upgraded their forwards?

Imagine they somehow managed to upgrade their forwards to the point where they scored 74 goals instead. With 74 goals scored and 43 goals conceded, Manchester United can expect 69.7 points, or if we round up, a 5 point difference. What if they decided to upgrade their defense instead? If they allowed 33 goals instead of 43, they can expect 73.8 points.
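These what-if scenarios are easy to reproduce in a few lines of Python. The sketch assumes the formula P = T * G^1.18 / (G^1.18 + C^1.23) with T = 114; the function name and scenario labels are illustrative, while the goal totals are the ones quoted above.

```python
def expected_points(goals_scored, goals_conceded, total_points=114):
    """Assumed form of the soccer formula: T * G^1.18 / (G^1.18 + C^1.23)."""
    scored = goals_scored ** 1.18
    conceded = goals_conceded ** 1.23
    return total_points * scored / (scored + conceded)


# (goals scored, goals conceded) for each scenario discussed in the text.
scenarios = {
    "actual season": (64, 43),
    "better attack (+10 goals scored)": (74, 43),
    "better defense (-10 goals conceded)": (64, 33),
}

baseline = expected_points(*scenarios["actual season"])
for label, (scored, conceded) in scenarios.items():
    points = expected_points(scored, conceded)
    print(f"{label}: {points:.2f} expected points "
          f"({points - baseline:+.2f} vs. actual season)")
```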

A second major use of the Pythagorean Expectation formula is to evaluate how lucky a team was in a season. Remember, a team’s Pythagorean Expectation is the number of points it can expect to finish with given its goals scored and goals conceded. In reality, however, the distribution of those goals across matches heavily affects the results.

For instance, over the course of the 2013-2014 season, Tottenham was the “luckiest” team. With their goals scored and goals conceded, they could have been expected to finish with 53.9 points, but they actually finished with 69 points. This implies that the distribution of goals heavily favored them, which is partially down to skill and talent but also heavily dependent on luck. As for Swansea, they were the unluckiest team in the league. They finished more than 9 points below what they could have been expected to finish with.
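One way to put a number on “luck” is actual points minus Pythagorean expected points, sketched below in Python. The formula form P = T * G^1.18 / (G^1.18 + C^1.23) with T = 114 is assumed, and the function names are hypothetical. Tottenham’s goal totals (55 scored, 51 conceded) are not given in this article; they are taken from the final 2013-2014 table and should be treated as an assumption.

```python
def expected_points(goals_scored, goals_conceded, total_points=114):
    """Assumed form of the soccer formula: T * G^1.18 / (G^1.18 + C^1.23)."""
    scored = goals_scored ** 1.18
    conceded = goals_conceded ** 1.23
    return total_points * scored / (scored + conceded)


def luck(actual_points, goals_scored, goals_conceded):
    """Actual points minus expected points; positive means overperformance."""
    return actual_points - expected_points(goals_scored, goals_conceded)


# Tottenham 2013-2014: 69 points (from the text); 55 scored and 51 conceded
# are assumed values, not quoted in the article.
tottenham = luck(actual_points=69, goals_scored=55, goals_conceded=51)
print(f"Tottenham over/underperformance: {tottenham:+.1f} points")
```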

From 1992-1993 to 2013-2014, no team has over or underperformed its Pythagorean Expectation by more than 4 points on average. Teams might get lucky one season and overachieve, but they can be expected to regress towards the mean in the near future. No team ever stays lucky, and nobody can be lucky consistently. For instance, Tottenham finished with 69 points when they were expected to finish with 54 points. Maybe they got exceptionally lucky, maybe their players were just in the right place at the right time. What the Pythagorean Expectation tells us, however, is that if they score the same number of goals and let in the same number, they should be expected to finish closer to 54 points than to 69, a massive difference of 15 points. Like it or not, luck can and will always play a role in professional sports, no matter what the numbers are telling us.

The Pythagorean Expectation is an important tool for soccer research. It allows us to project the effects that additional goals scored and conceded will have on the team’s overall result. It also allows us to have an idea of which teams “over performed”, which teams “under performed” and what we can expect from those teams moving forward into the next season and beyond.