If you recall, last week I talked about one approach we can take to evaluating starting pitcher performance. Today, I’d like to continue in that vein, this time taking a look at relief pitching.

With regard to evaluating both player performance and player talent, relief pitching is one of the least understood aspects of baseball. There are a few factors that lead me to believe this, but the only one I’d like to talk about today is the problem of mid-inning pitching changes.

You’ve probably noticed the ridiculousness of the following situation: A starting pitcher grooves through the first six innings of the game, allowing just one run. However, he loses his control with two outs in the seventh and walks the bases loaded, forcing the manager to call on the bullpen to get the last out. The reliever proceeds to allow a three-run double, followed by the third out.

Of course, in this situation, the starting pitcher is held responsible for, and subsequently charged with, all three runners that he left on base when the manager removed him from the game. Conversely, the reliever, despite being the active pitcher while all three runs scored, is rewarded for pitching a scoreless third of an inning. In the box score for the game, there is no indication that the three runs charged to the starting pitcher in fact scored while he was sitting on the bench.

Most people will agree that this is not only unfair to the starting pitcher, whose salary and legacy are strongly tied to his ERA, but also a significant flaw in the way that we allocate runs allowed. The flaw matters most for relief pitchers: because they pitch so few innings each season, every run charged or not charged to them moves their numbers a great deal.

On the surface, there is not a simple solution to this issue. For obvious reasons, we cannot assign all inherited runners to the reliever, nor can we assign none, and any whole number in between will likely be an arbitrary number.

Luckily for us, there is a fantastically simple solution: RE24. While it has a complicated name, the idea is straightforward: RE24 takes the team’s expected runs scored in the rest of the inning after the play, subtracts its expected runs scored before the play, and adds any runs that scored on the play itself. In other words, how much did the hitter help his team score runs or, conversely, how much did the pitcher help his team prevent them?

Let’s apply that to the case of inherited runners, specifically the one presented above. When the manager brings in the reliever with the bases loaded and two outs, the batting team is expected to score about 0.7 runs over the rest of the inning, given an average offense and defense. However, the reliever then gives up a three-run double before recording the third out. At the end of the inning, the batting team obviously has a run expectancy of zero. So run expectancy dropped from 0.7 to 0 after the reliever entered the game, but three runs scored. From the pitcher’s side, the 0.7-run drop in run expectancy minus the three runs scored means that the reliever’s RE24 for the inning was -2.3. In contrast to the official Runs Allowed, in which the reliever is assigned zero runs, RE24 charges him with all three runs that scored, minus the 0.7 runs that were already expected to score when the starter left the game.
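For anyone who prefers to see the arithmetic spelled out, here is a minimal Python sketch of that calculation from the pitcher’s point of view. The function name `pitcher_re24` is my own, and the 0.7 figure is the approximate run expectancy quoted above rather than a value pulled from an official run expectancy table.

```python
def pitcher_re24(re_before: float, re_after: float, runs_scored: int) -> float:
    """Run expectancy a pitcher saved (positive) or cost (negative).

    re_before:   batting team's expected runs when the pitcher's stint starts
    re_after:    batting team's expected runs when the stint ends
    runs_scored: runs that actually scored during the stint
    """
    return re_before - re_after - runs_scored


# Scenario from the text: bases loaded, two outs (about 0.7 expected runs),
# then a three-run double and the third out (expectancy falls to 0).
reliever_re24 = pitcher_re24(re_before=0.7, re_after=0.0, runs_scored=3)
print(round(reliever_re24, 1))  # -2.3
```

Summing this value over every stint a pitcher works in a season gives his season RE24, which is why the stat is cumulative.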

Make sense? Good. Let’s look at some actual numbers.

Top ten reliever seasons by RE24, since 1974:

Mark Eichhorn’s epic season in which he pitched 157 innings with a 1.72 ERA all in relief leads the way, and it’s not close. There are some other big names on this list, as well as some not-so-big names, but regardless, these are ten of the best relief seasons in the past 40 years.

While the above list is somewhat interesting, it doesn’t really tell us everything we want to know. First of all, RE24 is a cumulative metric, as evidenced by the fact that almost every one of those relievers pitched over 100 innings, a feat rarely seen in recent years. Secondly, the scale means very little for most people. There is nothing to compare that number to as far as we normally measure pitching, making it difficult to practically use in conversation and discussion.

These two concerns can be solved in one simple way: convert RE24 to a “runs-per-nine-innings” scale similar to ERA and RA. At this point, I must give credit where credit is due, as Tom Tango provided the method for converting RE24 to an RA9-scale yesterday:

[W]e can recast RE24 into an RA9 scale (i.e., similar to ERA) as follows. Say the league average is .48 runs per inning. Say you have a pitcher that has an RE24 of +40 runs and has pitched 200 innings. That means the league average is .48 x 200 = 96 runs, and our pitcher here is 40 runs better than that, or 56 runs allowed. So, his (RE24-based) RA9 is simply 56/200*9 = 2.52.

Don’t get bogged down in the details of the calculation. The result is a metric which measures the pitcher’s runs allowed per nine innings using RE24 instead of actual runs allowed. We’ll call this Context RA9, or cRA9, going forward.
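If you do want the details, the conversion quoted above fits in a few lines of Python. This is only a sketch of that recipe: the function name `context_ra9` is mine, the .48 runs per inning is the illustrative league average from the quote rather than a real league figure, and innings are assumed to be a true decimal (so 157 1/3 innings is 157.333, not the box-score shorthand 157.1).

```python
def context_ra9(re24: float, innings: float, league_runs_per_inning: float) -> float:
    """Recast a pitcher's RE24 onto a runs-per-nine-innings scale."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    # Runs a league-average pitcher would allow over the same innings.
    league_average_runs = league_runs_per_inning * innings
    # A positive RE24 means runs saved, so subtract it from the average.
    context_runs_allowed = league_average_runs - re24
    return context_runs_allowed / innings * 9


# The example from the quote: +40 RE24 over 200 innings, .48 runs per inning.
print(round(context_ra9(re24=40, innings=200, league_runs_per_inning=0.48), 2))  # 2.52
```

A pitcher with an RE24 of exactly zero comes out at the league-average rate, which is a quick sanity check on the scale.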

The great thing about cRA9 is not just that we have a more accurate measure of a relief pitcher’s run prevention, but that we can compare it to his RA9 in order to identify the best and worst pitchers at preventing inherited runners from scoring.

First of all, the leaders in reliever Context RA9 since 1974:

Wow. I remember David Robertson’s 2011 season as being very good, but based on RE24/cRA9, it was historically elite. He is known as “Houdini” for his ability to pitch out of seemingly impossible jams, and these numbers are just further evidence of that. Amazingly enough, Al Alburquerque had an equally impressive season the same year, albeit with fewer innings pitched.

As noted above, it may also be interesting to look at the relief pitchers with the largest positive differences between their Context RA9 and their normal RA9, that is, the pitchers who received more credit than they deserved from RA9 and ERA:

This list, above everything else, is evidence that reliever ERA or RA9 can be, frankly, horrible indicators of a pitcher’s actual performance. By RA9, pitchers like Quisenberry and Franco look fantastic, but by cRA9, we might think them deserving of demotion to the minors.

On the other side of the coin, here are the ten largest negative differences between cRA9 and RA9 since 1974, that is, the pitchers who were better than their RA9 and ERA indicated:

Again, these numbers are indicators of the unreliability of conventional run prevention metrics with regard to relievers, especially relievers who often come into games in the middle of an inning. By cRA9, we see a list of mostly very good, possibly elite, relief pitchers, but by RA9, we see a list of pitchers only worth using in blowouts.
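The comparison behind those two lists is just cRA9 minus RA9, sorted. Here is a rough Python sketch of that step. Everything in it is made up for illustration: the two relievers and their stat lines are hypothetical, the class and function names are mine, and the .48 league rate is the placeholder figure from the conversion example, not real data.

```python
from dataclasses import dataclass


@dataclass
class RelieverSeason:
    name: str
    innings: float      # true decimal innings, e.g. 60.333 for 60 1/3
    runs_allowed: int   # official runs charged to the pitcher
    re24: float         # season RE24, positive means runs saved


def ra9(season: RelieverSeason) -> float:
    return season.runs_allowed / season.innings * 9


def context_ra9(season: RelieverSeason, league_rpi: float) -> float:
    return (league_rpi * season.innings - season.re24) / season.innings * 9


def gap(season: RelieverSeason, league_rpi: float) -> float:
    """Positive: RA9 flatters the pitcher. Negative: RA9 undersells him."""
    return context_ra9(season, league_rpi) - ra9(season)


LEAGUE_RPI = 0.48  # illustrative league runs per inning, not a real value

# Hypothetical seasons, invented purely to show both directions of the gap.
seasons = [
    RelieverSeason("Reliever A", innings=60.0, runs_allowed=20, re24=-5.0),
    RelieverSeason("Reliever B", innings=60.0, runs_allowed=30, re24=10.0),
]

# Largest positive gap first: the pitchers RA9 gives too much credit.
ranked = sorted(seasons, key=lambda s: gap(s, LEAGUE_RPI), reverse=True)
for s in ranked:
    print(f"{s.name}: RA9 {ra9(s):.2f}, cRA9 {context_ra9(s, LEAGUE_RPI):.2f}, "
          f"gap {gap(s, LEAGUE_RPI):+.2f}")
```

In this toy example, Reliever A has the better RA9 but the worse cRA9, which is exactly the pattern the first list is meant to surface.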

The issue presented at the beginning of this piece should not be a controversial one. The idea that starting pitchers should be responsible for all runners left on base is ridiculous, and the idea that relievers should bear no responsibility for those runners is just as ridiculous and even more consequential. As the results above show, a relief pitcher’s performance when brought in mid-inning can be the difference between one of the best reliever seasons in baseball and one of the worst.

So next time you see a relief pitcher’s ERA, consider the context of their appearances and their RE24 compared to their peers before you come to any conclusions.