By the way, I hate math. Love statistics, but hate the math to calculate them. It feels like homework to me. I’m going to do my best to explain the tests I used as we go along, but I’m not an expert.

Using a (linear) Pearson correlation on a scatterplot of the last four seasons (we debated on ASA Slack whether Pearson was the best correlation method for these tests, but ultimately the data scientists, which I am not, determined it was), we see that points-per-game and goal differential per season are highly correlated, as you would expect. In other words, to earn points you need a higher goal differential. We understand this intuitively at the match level: a team can't take three points without outscoring its opponent, and an even goal differential gets each team one point.
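If you want to poke at this yourself, here's a minimal sketch of a Pearson correlation in Python (the analysis above was done in R). The season numbers below are made up purely for illustration, not real MLS data:

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-season pairs: (goal differential, points-per-game)
goal_diff = [-22, -10, -3, 5, 12, 20, 31]
ppg = [1.2, 1.1, 1.3, 1.5, 1.7, 1.8, 2.0]

r = pearson_r(goal_diff, ppg)
print(round(r, 2))
```

With real league data you'd pull this from a results table instead of typing it in, but the calculation is the same.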

But some teams have made the playoffs with an overall negative goal differential (witness the 2017 San Jose Earthquakes sneaking into the playoffs with a -22 goal differential). R here shows the correlation coefficient as 0.9; squaring it (0.9&#178; = 0.81) tells us that 81% of the variance in one variable (points-per-game) can be explained by the variance in the other variable (goal differential). Simply put, that's a really strong correlation between points-per-game and goal difference. We can use this as a baseline for comparisons with other correlations. If you are wondering what the p-value is, it's the probability of seeing a correlation at least this strong by chance if there were actually no relationship between the two variables. Any value less than or equal to 0.05 is conventionally a good sign (see null hypothesis for a more detailed explanation), so we're also good there.
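The "variance explained" arithmetic is just the correlation coefficient squared. A quick sketch (again in Python rather than R), using the r = 0.9 reported above:

```python
# r-squared: the share of variance in points-per-game that
# goal differential accounts for. r = 0.9 is the value from
# the scatterplot above.
r = 0.9
r_squared = round(r ** 2, 2)
print(r_squared)  # 0.81, i.e. 81% of the variance
```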

At a game level, this gets a bit messier because of the jump from one to three points for a positive outcome (rather than one to two points).
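To make that asymmetry concrete, here's the standard 3/1/0 points rule as a tiny function (a sketch; the function name is mine):

```python
def match_points(goals_for, goals_against):
    """Standard league scoring: win = 3 points, draw = 1, loss = 0."""
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0

# The step from a draw to a win (1 -> 3) is twice the step from a
# loss to a draw (0 -> 1), which is what makes the game-level
# relationship between goal differential and points lumpy.
print(match_points(2, 1), match_points(1, 1), match_points(0, 1))
```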