

I stopped writing regularly for this blog a few weeks ago because of work. Many a night is now spent at the computer, but not watching hockey. So in the most recent post-game, when I was asked to explain what Pythagorean expectation meant, I wasn't going to answer. For one, I don't have much free time to spare these days, and for two, why reinvent the wheel? This is the age of Google, after all.



But I can't let a comment like this go:

Good Lord I didn’t realize that a teams projected point total based on their GF and GA was related to the equation for solving the hypotenuse of a right triangle (surely you know the formula a^2 +b^2 = c^2). The Pythagorean Point Equation eh? Please explain this formula to me R O. You posted the link so you should be willing to back it up. Maybe you could complete a derivation of it so I can gain a better understanding. Or you could complete an uncertainty analysis based on the projected point totals and actual point totals of all thirty teams using their GF and GA (I will explain this if needed) to prove that it isn’t complete bogus

Two things:

1.) My personal belief, which Kent may not share (this is his blog after all), is that commenters who want to move the conversation forward should be ready and expected to do some legwork themselves. There are a ton of resources out there (www.nhl.com, www.behindthenet.ca, www.hockeyanalytics.com, etc. etc.) and Microsoft Excel is also your friend here.

2.) Asking why P% can be predicted by an a^2 + b^2 = c^2 formula is missing the forest for the trees. Again, my belief (and one that Kent may not share as well) is that this whole hockey-stats movement is not about glorifying seemingly meaningless numbers. It's about gaining insight into hockey - measuring what's important, seeing what drives success, and more importantly, what doesn't drive success.

So the right question to ask isn't "why does a^2 + b^2 = c^2 predict points percentage?" The right question to ask is "is it really possible to predict how often you pick up points simply by looking at how often you score and how often you get scored on?"

Because, folks, if this is possible, then BIG GOALS and BIG SAVES are mirages.

Pythagorean formula: Points% = GF^2 / ( GF^2 + GA^2 )
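For anyone who would rather poke at the formula than take it on faith, here is a minimal Python sketch of it. The function name, the `exponent` parameter and the 250 GF / 220 GA team are my own placeholders for illustration, not real data; the exponent of 2 is just the one in the formula above.

```python
def pythagorean_points_pct(goals_for, goals_against, exponent=2.0):
    """Expected points% from season goal totals: GF^x / (GF^x + GA^x)."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


# Hypothetical team-season, made up for illustration only.
pct = pythagorean_points_pct(250, 220)
expected_points = pct * 2 * 82  # 2 points available per game, 82 games
print(f"expected points%: {pct:.3f}, expected points: {expected_points:.1f}")
```

A team that scores exactly as much as it allows comes out at 0.500, which is a decent sanity check.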

I was asked what the origins of the Pythagorean point expectation were. I believe its origins are in baseball, and as far as I know it's an empirical equation. Basically some sharp guys decided that they ought to take runs scored for and against and try to map them to wins, and the Pythagorean formula fit best.

Alan Ryder of HockeyAnalytics did some research on an improved model for goals->wins. It's some really crazy stuff; you can check it out here under "Probability Models". The point is not the model though, it's the assumption he made: goals are rare, uniformly random and memory-less.

Read that again.

Goals are rare.

Goals are uniformly random.

Goals are memory-less.

Alan Ryder did a quick check that suggests, fairly convincingly, that the above is true, except in the first minute of each period and the last minute of the third period. There are some conjectures as to why these exceptions occur (the last minute of the game is empty-net time, the first minute of each period contains less time spent in the zone during PPs) but the exceptions are less interesting than the main result.

For you see, if goals really are rare, and uniformly random, and memory-less, then there's not very much real estate for the BIG GOAL and BIG SAVE to coexist with regular, boring hockey truth. No matter how much a skater might elevate his game to the level of CLUTCH or how well a goalie just makes BIG SAVE after BIG SAVE because he wants to win more, goals keep getting scored and saves made in the same measures, minute after minute after boring minute.

And this ties in to Pythagorean expectation quite well. Because sometimes you look at a team that just keeps on winning, but without the goal differential to back it up. For example, the Avs. And their troll fans will come to you after winning against your team despite playing like garbage, and when you point that out one of the usual retorts is "we found a way to win" or "we scored a big goal and our goalie made some big saves".

And that, friends, is absolute grade-A bullshit. Because point percentage can be modeled by the Pythagorean expectation. And even more important, point percentage can be modeled by randomly distributing goals scored for and against over 82 games.

Observe:

The red dots are true points% for the 120 post-lockout team seasons. By "true points" I mean 2 points for a win (regulation or OT), 0 points for a loss (regulation or OT), 1 point for a tie. Shootouts and loser points are gone; the model doesn't account for them. And before anybody says anything, you can augment the model to account for shootouts by adding a coinflip component in the case of ties. See point 1.) above about legwork.
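To make the "true points" bookkeeping concrete, here is a small Python sketch. The function name and the 50-20-12 record are hypothetical, made up for illustration; the scoring (2 for a win, 1 for a tie, 0 for a loss) is the one described above.

```python
def true_points_pct(wins, losses, ties):
    """Points% with 2 points per win, 1 per tie, 0 per loss (no loser point)."""
    games = wins + losses + ties
    points = 2 * wins + ties
    return points / (2 * games)


# Hypothetical 82-game record, not a real team.
print(f"{true_points_pct(50, 20, 12):.3f}")
```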

The blue line is what the Pythagorean expectation predicts. Pretty straight down the middle eh? I didn't calculate R^2 or whatnot, that doesn't really interest me - more insight can be gained from the green dots.

The green dots, ah the green dots, these are points percentages from simulations that I ran for every team season. Basically I took the GF and GA of each team-season and randomly distributed them (with uniform distribution) over 82 games. I did that 500 times per team season, calculated the points % each time, and put it on the plot.
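If you want to try this at home, the simulation can be sketched roughly like this in Python. This is my reading of the procedure described above, not the original spreadsheet or script: every goal for and every goal against gets dropped into one of 82 games chosen uniformly at random, and each game is then scored 2/1/0 for win/tie/loss. The function names, the seed and the 250 GF / 220 GA team are placeholders for illustration.

```python
import random


def simulate_points_pct(goals_for, goals_against, games=82, rng=None):
    """One simulated season: scatter goals uniformly over games, return points%."""
    if rng is None:
        rng = random.Random()
    gf = [0] * games
    ga = [0] * games
    # Each goal lands in a game picked uniformly at random.
    for _ in range(goals_for):
        gf[rng.randrange(games)] += 1
    for _ in range(goals_against):
        ga[rng.randrange(games)] += 1
    # 2 points for a win, 1 for a tie, 0 for a loss.
    points = 0
    for scored, allowed in zip(gf, ga):
        if scored > allowed:
            points += 2
        elif scored == allowed:
            points += 1
    return points / (2 * games)


def simulate_many(goals_for, goals_against, trials=500, games=82, seed=None):
    """Repeat the simulated season `trials` times and collect the points%."""
    rng = random.Random(seed)
    return [
        simulate_points_pct(goals_for, goals_against, games, rng)
        for _ in range(trials)
    ]


# Hypothetical team-season, made up for illustration only.
results = simulate_many(250, 220, trials=500, seed=42)
print(f"min {min(results):.3f}  mean {sum(results) / len(results):.3f}  "
      f"max {max(results):.3f}")
```

The spread between the min and max is the interesting part: it gives a rough idea of how far a team's points% can drift from its goal totals on random scatter alone.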

You see? EVERY team season in the post-lockout era is well within the range that one would expect from a parallel universe where there was no dispute about BIG GOALS or BIG SAVES - where goals are scored when they are scored and goalies make saves as best they can and we don't ever talk about clutch or elevating your game.

Maybe, just maybe, that parallel universe is this one.