This is Part I of the Season Score Metric v2.0 diary series. Part II is here.

Last week, I set out to create a "Season Score" metric to provide meaningful context beyond record alone when comparing the last few seasons under Harbaugh to the Michigan glory years. The result was this diary and a scoring metric that was well-received for its novelty, but thoroughly critiqued for its huge limitations. To summarize, here is that scoring metric:

(Wins x Win%) + Quality Wins - Bad Losses + (Wins vs. Opponents with Winning Records x Win% vs. Opponents with Winning Records) + 5 if Beat Ohio + 3 if Beat MSU + 3 if Win Bowl + 5 if B1G Champion

And here are the major caveats:

- The relative strength of OSU and MSU in a given year is ignored
- A Quality Win vs. the #1 team is treated the same as a win vs. the #25 team
- Shared vs. outright B1G titles are treated the same
- All advanced stats are ignored

The responses to the diary also pointed out that the metric was missing the major components of expectations and improvement. It rankled at least one person that Harbaugh's seasons were rated relatively poorly, as there was no credit given to the huge turnaround he managed so quickly.

As a result, the Season Score metric created a couple of wonky rankings and didn't pass the eye test for several seasons. As user Lumpers pointed out:

"...There is no way the 1985 team with that defense (3 straight shutouts during the season), a 10-1-1 record and a victory over Nebraska in the Fiesta bowl is not one of the top 10 seasons in M history over the past 49."

And so, I decided to go back to the drawing board and try to address some of these issues. If you want to get right into the new metric, feel free to skip ahead or check it out for yourself. However, before I dive in, I just wanted to quickly touch on the merit of a season score metric in general.

Is a successful season more like a fine wine or an S&P ranking?

~~Chuck says, "The answer is always wine."~~

I was initially surprised by responses claiming that season success is an inherently subjective matter. User ChiBlueBoy summarized:

"I also appreciate trying to put some numbers to something that will always be subjective (in Jr. High I created a mathematical formula to determine if someone was "attractive," so the desire to quantify the subjective resonates with me)."

My first reaction was to scoff at this comparison. Perhaps that was only because I had spent a decent chunk of time making the scoring metric, but I viewed the idea as more like S&P: an attempt to make objective measures of a team's offense and defense. Just as it is valid to say that a running play is objectively "successful" if it gains at least 4-5 yards on first down, it is valid to say that the objective components of a "successful" season include beating our rivals and winning the Big Ten title.

I think ChiBlueBoy mentioned the key contention:

"In the end, all of this is very subjective, and I imagine that each of us would come up with a different formula."

Very true. In fact, I've done just that and made an entirely new metric. However, I would counter that some formulas are objectively better than others. And, if you continue reading, I think you'll agree that this new metric is objectively better than the previous one. Nevertheless, the point still stands to a very real degree. This new and improved metric doesn't account for everything and changing the metric weights here or there would alter the rankings significantly.

Ultimately, the most powerful utility of this metric is therefore comparing seasons between tiers, rather than within. All else being equal, a season that ends with beating OSU is objectively better than one that ends with losing to OSU. But, was the 1980 Michigan team better than the 1985 squad? Well, Chuck says that exercise is as subjective as comparing fine wines.

I think an apt analogy would be that this metric is like PER for basketball players. It provides meaningful context to make comparisons between players and gives objective evidence to say that LeBron James is better than Reggie Jackson. But PER alone can't answer who is better between, say, Steph Curry and James Harden.

Season Score Metric v2.0

The new metric is computed using the sum of several individual component scores as follows:

Season Score = Expectation & Improvement Score + Wins Score - Loss Score + OSU Score + MSU Score + Bowl Score + B1G Champ Score + National Champ Score
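To make the bookkeeping concrete, here's a minimal sketch in Python. The components are just passed in as numbers here; each one is defined in the sections that follow.

```python
def season_score(e_and_i, wins, losses, osu, msu, bowl, b1g, natl):
    """Sum the component scores; the Loss Score is the only term subtracted."""
    return e_and_i + wins - losses + osu + msu + bowl + b1g + natl

# 1997, using the component values from the Top Ten table below
score_1997 = season_score(3.39, 6.82, 0.00, 2.50, 0.95, 1.36, 3, 5)
# ≈ 23.02; the table's 23.03 reflects rounding of the individual components
```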

In Part I, I'll go through each component metric individually. Part II will look closer at this season's score and try to project the future based on previous responses to down years in Michigan history.

Expectation & Improvement Score

Had MGoBlog existed all the way back in 1969, this would have been a much easier exercise. Unfortunately, there is no easily accessible database of well-informed predictions for how the Bo, Moeller, and Carr teams would perform going into the season. What we do have is preseason polls. Now, I acknowledge that the preseason AP poll is almost entirely pure conjecture. But it does give some standard measure of how Michigan was expected to perform relative to its peers every season. And we would all agree that it was disappointing when the preseason #5 2007 Michigan team finished #18, and that the 1997 season was especially amazing considering they were ranked #14 going into the season.

So, long story long, the expectation component is derived from the preseason AP ranking. I did this by taking Michigan's final AP rankings over the last 49 years (using record rank when they were unranked by the AP) and graphing them against each season's winning percentage:

I then plugged the Preseason AP ranking into the formula from that fit line to get Expected Win%, and then multiplied by the number of games in the season to get Expected Wins. The Expectation Score is thus:

Actual Wins x Actual Win% - Expected Wins x Expected Win%

As an example, Michigan was ranked #7 going into last season, which correlates to 10.4 wins. This season, Michigan was ranked #11 going into the season, which correlates to 8.8 wins (for the regular season). IMO, these pass the eye test.
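The diary doesn't reproduce the fitted formula itself, so as a rough sketch, the coefficients in the log curve below are back-solved from the two worked examples above purely for illustration; the 13-game and 12-game schedule lengths are likewise my assumption.

```python
import math

# Stand-in for the rank-to-Win% fit line; NOT the diary's actual fitted
# formula. Coefficients chosen to reproduce the two examples in the text.
def expected_win_pct(preseason_rank, a=1.0885, b=0.1482):
    return a - b * math.log(preseason_rank)

def expected_wins(preseason_rank, games):
    return expected_win_pct(preseason_rank) * games

round(expected_wins(7, 13), 1)   # ~10.4, last season's example
round(expected_wins(11, 12), 1)  # ~8.8, this regular season
```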

The Improvement Score was based on a much simpler formula:

Wins x Win% - Previous Season Wins x Previous Season Win%

Finally, the Expectation and Improvement Scores are combined and weighted:

(Expectation Score + Improvement Score) / 3
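Putting the two pieces together, a minimal sketch of the whole component:

```python
def expectation_score(wins, win_pct, exp_wins, exp_win_pct):
    # Actual Wins x Actual Win% minus Expected Wins x Expected Win%
    return wins * win_pct - exp_wins * exp_win_pct

def improvement_score(wins, win_pct, prev_wins, prev_win_pct):
    # Same shape, but measured against last season instead of the preseason poll
    return wins * win_pct - prev_wins * prev_win_pct

def e_and_i_score(exp_score, imp_score):
    # Both components together are down-weighted by a factor of 3
    return (exp_score + imp_score) / 3
```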

Wins Score

My previous metric looked only at "Quality Wins," which were defined as "wins against opponents that were ranked (in the AP poll) when they played Michigan AND finished with a winning record, OR wins against opponents that finished the season ranked (In the AP poll)."

The problem was that it weighted a win against the #1 team the same as against the #25 team, and gave no boosts for beating the #26 or #27 team.

This new metric scraps the idea of Quality Wins and instead just looks at the winning percentage of defeated opponents with the Michigan game removed from their record:

Wins Score = Wins x Win% x Opponent Win % (Mich game removed)

Losses Score

The previous metric included "Bad Losses," which were defined as "losses against opponents that were not ranked (in the AP poll) when they played Michigan AND were not ranked (in the AP poll) at the end of the season."

This again relied too heavily on AP rankings. The new metric scraps the idea of Bad Losses and instead looks at the loss percentage of opponents that defeated Michigan (again with the Michigan game removed):

Losses Score = Losses x Loss% x Opponent Loss % (Mich game removed)
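The Wins and Losses Scores are mirror images of each other; a minimal sketch of both (the opponent percentages are pooled inputs, computed elsewhere with the Michigan game removed):

```python
def wins_score(wins, win_pct, defeated_opp_win_pct):
    # Opponent Win% is the pooled winning percentage of the teams
    # Michigan beat, with the Michigan game removed from their records
    return wins * win_pct * defeated_opp_win_pct

def losses_score(losses, loss_pct, victor_loss_pct):
    # Mirror image: how often the teams that beat Michigan
    # lost to everyone else
    return losses * loss_pct * victor_loss_pct
```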

Rival Score

The previous metric gave 5 points for an OSU victory, 3 points for an MSU victory, and 3 for a bowl game win.

This totally ignored the fact that a win over a 3-9 Michigan State team is not as impressive as winning the Rose Bowl.

The new metric accounts for the rival and bowl opponent winning percentage, and is broken down as follows:

OSU Score: If win, then +3 x OSU Win % (Mich game removed); If loss, then -3 x OSU Loss % (Mich game removed)

MSU Score: If win, then +1.5 x MSU Win % (Mich game removed); If loss, then -1.5 x MSU Loss % (Mich game removed)

Bowl Score: If win, then +1.5 x Bowl Opp Win % (Mich game removed); If loss, then -1.5 x Bowl Opp Loss % (Mich game removed)
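All three rules share one shape, so they can be sketched as a single function (the weight is 3 for OSU, 1.5 for MSU and the bowl; percentages have the Michigan game removed):

```python
def rival_score(won, weight, opp_win_pct, opp_loss_pct):
    # Win: credit scales with how good the opponent was.
    # Loss: penalty scales with how bad the opponent was.
    return weight * opp_win_pct if won else -weight * opp_loss_pct

# Beating an OSU team that went 11-1 in its other games
# is worth 3 x 11/12 = 2.75
rival_score(True, 3, 11/12, 1/12)
```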

B1G Champ Score

Whereas the previous metric awarded 5 points for every Big Ten title, the new metric differentiates between shared and outright titles:

B1G Champ Score: If outright, then 3; If shared by multiple teams, then 3 / # of teams sharing; If Division Champ but lose Conference Championship, then 1.5
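As a quick sketch, the rule reads as a simple three-way branch:

```python
def b1g_champ_score(outright=False, teams_sharing=1, division_only=False):
    if outright:
        return 3.0
    if teams_sharing > 1:        # a shared title splits the 3 points
        return 3.0 / teams_sharing
    if division_only:            # division champ, lost the title game
        return 1.5
    return 0.0
```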

National Champ Score

Finally, the last component accounts for winning the national championship, with credit for (thus far theoretical) playoff wins along the way:

National Champ Score: 5 points for National Championship; 3 x Playoff Opp Win % (Mich game removed) for every Playoff victory
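A minimal sketch, where each entry in the playoff list is that opponent's Win% with the Michigan game removed:

```python
def national_champ_score(won_title, playoff_opp_win_pcts=()):
    # 5 for the title itself, plus 3 x opponent Win% per playoff victory
    score = 5.0 if won_title else 0.0
    return score + sum(3 * p for p in playoff_opp_win_pcts)
```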

Top Ten Seasons

| Year | Coach | Season Score | Record | Final AP | Preseason AP | E&I Score | Wins Score | Loss Score | OSU | MSU | Bowl | B1G | Nat. Champ |
|------|---------|-------|--------|----|----|-------|------|------|------|-------|-------|-----|---|
| 1997 | Carr | 23.03 | 12-0 | 1 | 14 | 3.39 | 6.82 | 0.00 | 2.50 | 0.95 | 1.36 | 3 | 5 |
| 1980 | Bo | 12.52 | 10-2 | 4 | 11 | 1.38 | 4.11 | 0.09 | 2.45 | 0.45 | 1.23 | 3 | 0 |
| 1985 | Bo | 12.42 | 10-1-1 | 2 | 29 | 2.66 | 5.14 | 0.02 | 2.45 | 0.95 | 1.23 | 0 | 0 |
| 2011 | Hoke | 10.78 | 11-2 | 12 | 34 | 2.97 | 5.50 | 0.11 | 1.50 | -0.35 | 1.27 | 0 | 0 |
| 1971 | Bo | 10.40 | 11-1 | 6 | 4 | 0.85 | 4.09 | 0.02 | 2.00 | 0.90 | -0.41 | 3 | 0 |
| 1989 | Bo | 10.23 | 10-2 | 7 | 2 | 0.00 | 4.27 | 0.04 | 2.18 | 1.09 | -0.27 | 3 | 0 |
| 2003 | Carr | 10.13 | 10-3 | 6 | 4 | -0.44 | 4.11 | 0.17 | 2.75 | 1.00 | -0.13 | 3 | 0 |
| 1991 | Moeller | 10.07 | 10-2 | 6 | 2 | 0.00 | 4.47 | 0.03 | 2.18 | 0.45 | 0.00 | 3 | 0 |
| 1988 | Bo | 10.01 | 9-2-1 | 4 | 9 | 0.41 | 3.23 | 0.02 | 1.20 | 0.82 | 1.36 | 3 | 0 |
| 1986 | Bo | 9.74 | 11-2 | 8 | 3 | 0.07 | 5.01 | 0.10 | 2.50 | 0.90 | -0.14 | 1.5 | 0 |

The 2011 Hoke season sticks out like a sore thumb, not only because of the doom that followed, but also because that team wasn't really that great. It is the only one of those seasons that is being propped up by the Expectation and Improvement Score. However, that season certainly did exceed expectations and was a huge improvement over the previous year, and it was certainly an exceptional season (for both positive and negative reasons). Outside of that year, this metric performs much better than the previous one on the eye test.

The 2011 Hoke Season Score does serve as a reminder that this metric does not measure the true quality of a team. The 2006 Carr team was an all-time quality team, but it missed out on glory and thus doesn't match up to these seasons in terms of overall success. The 2016 Harbaugh team was certainly better than the 2015 team, but the 2016 team underperformed relative to expectations, whereas the 2015 team greatly exceeded expectations and made huge improvements relative to the 2014 Hoke disaster.

Performances by Coach

As I explained earlier, the greatest utility of this metric is to break the seasons into tiers. This is the breakdown of the seasons into quartiles by Season Score, with Final AP rank as the eyeball test:

The biggest oddball is the 1975 Michigan squad, which finished 8-2-2 and ranked #8. The season score however puts it in the lowest quartile. That seems valid to me, given that they came into the season ranked #2, lost to OSU, lost the Orange Bowl, didn't get any share of a Big Ten title, and only beat a bunch of body bags. Sounds pretty bad to me, but I wasn't alive then, so I invite any MGoHistorians to add their perspective.

With those season quality quartiles established, we can look at how each coach has performed:

I hope this provides a little more perspective for people. Even Bo had down years. When you look at Season Scores over time, you get an even better sense that Michigan's success has ebbed and flowed. It looks like a company stock:

And while the stock may seem to be trending down recently, the trend is hardly significant:

I'll end Part I by saying that we should all know that football teams have their ups and downs. In Part II, I'll show that a key to the program's success in the glory days under Bo, Moeller, and Carr was that we were able to roll with the dong punches and snake bitten seasons and respond with great ones. And I'll look back at similar down seasons to project the future.