Trevor Hoffman is in the Hall of Fame. Next to Babe Ruth and Willie Mays, next to Pie Traynor and Jesse “Pop” Haines, next to Cap Anson and Jackie Robinson. Next to catchers who caught without gloves and pitchers who pitched without mounds and guys who spent their careers watching in suits from the dugout.

The Hall of Fame houses all sorts. The one thing they have in common is that each left some mark on the game. For Hoffman, that mark is the culmination of nearly a century of evolution in baseball’s bullpen.

Hoffman was also a great pitcher. He could not have left the mark he did were he not. His legacy, though, is more complicated than just being a great pitcher. It involves a lot of history and context that help explain why Hoffman—and not other, similarly great pitchers—became so iconic.

The Long, Slow March

Baseball has always struggled with what exactly to make of relievers—how to use them, how to value them, how to decide who should even be one in the first place. For most of the 20th century, the role was in a state of constant evolution, making it even more difficult to compare relievers from one era to the next.

Originally, relief pitchers were just starters coming in to back each other up on days when they didn’t have it. Guys like Cy Young and Kid Nichols and Mordecai “Three Finger” Brown used to lead the league in saves (or would have, had saves been invented yet). Later, as rosters expanded, teams started employing specialists: guys who had failed as starters or didn’t have the stamina or stuff to last nine innings but could be useful for a few innings at a time. These were the first relievers.

Specialists emerged among the specialists: mop-up guys whose job it was to come in when you were getting blown out and didn’t want to waste innings on someone else’s arm, or flamethrowers who could pitch out of a jam. Teams started to understand leverage and platoon advantages and other nuances that only fully came to life in the bullpen, and new roles and patterns emerged to capitalize on them.

Then in 1969 came the save. For the first time, teams had concrete terms by which to define the contributions of their best relievers. Usage patterns started coalescing around these terms, and the long, slow march toward the modern closer—one shaped almost entirely around the save statistic—began.

Hoffman came along at precisely the right time, just as this march was grinding to a halt. In 1993, the year Hoffman broke into the league, Lee Smith became the first pitcher to reach 400 saves. Lee Smith, the once future Hall of Famer, was the guy we used to think was Trevor Hoffman, the first embodiment of the save.

But the save was still evolving. Smith was the last man to lead his league with fewer than 30 of them, topping the NL with 29 saves in 1983. In 1991, with an ERA nearly seven tenths of a run higher and with 30 fewer innings pitched than in his ’83 season, Smith topped 40 saves for the first time. The next year, his ERA rose another eight tenths. He topped 40 saves again. In 1993, another seven tenths on his ERA, another 40-save season.

In his 20s, Smith averaged 4.3 outs per appearance. In his 30s, that dropped to 3.2, the same as Hoffman averaged over his career. The new era finally had arrived, but for Smith, it came too late. You can blame it on a simple twist of fate.

The Baffling Submarine

There has perhaps never been a more vexing closer than Dan Quisenberry. Over his career, he struck out hitters barely more than once every three innings. His submarine fastball topped out around the mid-80s, if that. He wasn’t particularly hard to hit. And yet, teams found him every bit as hard to score on as Hoffman.

Quisenberry was, even more than Smith, lost in the final burst of the transition toward the modern closer. In 1983, the same year Smith led the NL with 29 saves, Quisenberry became the first pitcher to record 40 saves in a season and the last to lead the league in saves while averaging over two innings per outing. Each of the five seasons Quisenberry led the American League in saves, he threw at least 120 innings.

Unlike Smith, though, Quisenberry never got to take advantage of that final transition. By the time teams finished settling on the save situation as the ultimate arbiter of closer usage, some combination of age and all those innings already had taken their toll. As a result, Quisenberry never really had the chance to run up his save total the way later closers could.

Aside from the save total and the fact that he never really struck anyone out, though, Quisenberry’s career actually looks pretty similar to Hoffman’s. Hoffman had several seasons during which he pitched 60-70 innings as an ace reliever. Quisenberry had not-quite-so-many seasons during which he pitched 120-140 innings as an ace reliever. That makes it a little hard to compare them on a seasonal basis, but we still can start there.


I’ve graphed each of Hoffman’s and Quisenberry’s seasons according to RA- (runs allowed divided by league average, adjusted for park) and innings pitched (larger dots indicate more innings). Each pitcher’s seasons are ordered from best to worst by the total number of runs allowed below average, taking into account both RA- and IP:

I say they’re similar, but in the graph it’s kind of hard to compare. Hoffman clearly has a greater number of productive seasons, and in his best years was allowing fewer runs than Quisenberry, but the impact of Quisenberry’s best seasons is magnified by his having thrown so many more innings. It’s not immediately apparent which is actually more productive.
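As a rough sketch of how that ranking works, the snippet below computes RA- and runs allowed below average for a season and then sorts seasons by the latter. The `Season` fields, the multiplicative park factor, and the two sample seasons are all hypothetical stand-ins, not the actual data or adjustment behind the graph.

```python
from dataclasses import dataclass


@dataclass
class Season:
    year: int
    innings: float      # true innings, e.g. 65.333 for 65 1/3
    runs: int           # total runs allowed
    league_ra9: float   # league-average runs allowed per nine innings
    park_factor: float  # assumed multiplicative factor, 1.0 = neutral park


def ra_minus(s: Season) -> float:
    """Runs allowed relative to a park-adjusted league average (100 = average)."""
    pitcher_ra9 = s.runs * 9 / s.innings
    return 100 * pitcher_ra9 / (s.league_ra9 * s.park_factor)


def runs_below_average(s: Season) -> float:
    """Runs an average pitcher would allow in the same innings, minus actual runs."""
    expected_runs = s.league_ra9 * s.park_factor * s.innings / 9
    return expected_runs - s.runs


def rank_seasons(seasons: list[Season]) -> list[Season]:
    """Order seasons from best to worst by total runs allowed below average."""
    return sorted(seasons, key=runs_below_average, reverse=True)


# Hypothetical seasons: a short, stingy one and a long, slightly leakier one.
short_season = Season(year=2001, innings=60.0, runs=20, league_ra9=4.5, park_factor=1.0)
long_season = Season(year=2002, innings=120.0, runs=45, league_ra9=4.5, park_factor=1.0)

for s in rank_seasons([short_season, long_season]):
    print(s.year, round(ra_minus(s), 1), round(runs_below_average(s), 1))
```

In this made-up pair, the longer season ranks first despite the worse RA-, which is the same tension between rate and volume that makes the two pitchers hard to compare.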

Here’s why I say they’re similar: What if we were to take every season in which Quisenberry threw over 100 innings and cut it in half, so that instead of a handful of 120- to 140-inning seasons, he has a bunch of 60- to 70-inning seasons like Hoffman?

I went through each of those 100-plus-inning seasons game by game, and as soon as Quisenberry crossed the halfway point to his season innings total, I cut it off and started a new season. When I do that, and include their career totals, suddenly he looks an awful lot like Hoffman:

Over his career, Quisenberry allowed 356 runs in 1043.1 innings. Hoffman allowed 378 runs in 1089.1 frames. Adjusting for park, league, and era, that works out to about 69 percent of the league average for Quisenberry and 71 percent of the league average for Hoffman. If you cut up Quisenberry’s career into seasonal chunks the same size as Hoffman’s, their seasonal performances also become pretty similar.
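The season-splitting step and the raw career arithmetic can be sketched roughly as follows. This is an illustration under assumptions: the function names are hypothetical, the game log is invented, and treating "crossed the halfway point" as "reached or passed half the season's outs" is my reading of the procedure, not a confirmed detail. Only the career run and innings totals come from the text, and the result is raw runs per nine innings before any park, league, or era adjustment.

```python
def innings_to_outs(ip: str) -> int:
    """Convert baseball innings notation ('1043.1' = 1043 1/3) to outs."""
    whole, _, thirds = ip.partition(".")
    return int(whole) * 3 + int(thirds or 0)


def ra9(runs: int, outs: int) -> float:
    """Raw runs allowed per nine innings (27 outs), with no adjustments."""
    return runs * 27 / outs


def split_season(game_outs: list[int]) -> tuple[list[int], list[int]]:
    """Split a game log in two once the running total reaches half the season.

    The game that reaches the halfway mark stays in the first half, so no
    single outing is ever cut in two.
    """
    total = sum(game_outs)
    running = 0
    for i, outs in enumerate(game_outs):
        running += outs
        if running * 2 >= total:
            return game_outs[: i + 1], game_outs[i + 1 :]
    return list(game_outs), []


# Career totals quoted in the text.
quisenberry_ra9 = ra9(356, innings_to_outs("1043.1"))
hoffman_ra9 = ra9(378, innings_to_outs("1089.1"))
print(round(quisenberry_ra9, 2), round(hoffman_ra9, 2))

# Invented five-game log, in outs per appearance.
first_half, second_half = split_season([6, 3, 6, 3, 6])
print(first_half, second_half)
```

Working in outs rather than decimal innings avoids the trap of treating ".1" as a tenth of an inning.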

Imagine you could eliminate five seasons from the peak of Hoffman’s career and make him re-pitch them right on top of five of his other seasons, so that every time he finishes one outing, he has to go right back out and pitch another the same day with no rest in between. Now imagine Hoffman doing that and still managing to replicate his success from the five original seasons you threw out.

That would be pretty impressive, right? That’s essentially what Quisenberry did.

Now imagine that you could go back and cut off all of Quisenberry’s outings after he completed his first inning so that he’d have thrown 60-70 innings a year like Hoffman. Then you give him a chance to make those innings up at the end of his career by having him pitch into his 40s like Hoffman did. He ends up pitching just as well as he did in his prime so that his overall numbers aren’t really affected either way. That’s impressive too, yeah? Well, that’s basically Hoffman.

Which is more impressive? I don’t know. They’re both impressive, and it isn’t immediately clear to me which I’d prefer.

If you dig a little deeper, you can find reasons to believe Hoffman was legitimately the better pitcher. Quisenberry pitched in front of better defenses, so Hoffman’s runs allowed is more impressive than the initial comparison shows. And, of course, Hoffman’s FIP is much better, even with Quisenberry hardly ever walking batters or giving up a home run.

Without looking at saves, though, you have to dig deeper to see that. I don’t think that is at all what most people would expect if you asked them to compare Hoffman to Quisenberry. Certainly the BBWAA wasn’t looking at FIP or Kansas City’s Total Zone Rating when all but 18 of them decided to leave Quisenberry off their HOF ballots in 1996.

I think there’s a very real chance that had Quisenberry been the one to come later, had his thousand or so innings been stretched out over 677 save opportunities while Hoffman’s were spent shutting down teams six outs at a time in the 1980s, Quisenberry might very well be in the Hall of Fame.

The First Steps

The story of the closer begins with Firpo Marberry.

Before the days of PA systems pumping “Enter Sandman” and “Hell’s Bells” through stadium rafters, Marberry transformed the bullpen into theatre. He’d fill the bandboxes of the American League with the bruising echo of his fastball against the bullpen catcher’s mitt, as if to put the opposing dugout on notice: The days of Claude Jonnard and Lou North and Dave Danforth were over. You chase Washington’s starter, and things only get harder.

Then came the trek in toward the mound: confident, imposing strides that focused everyone’s attention on a singular point. Teammates recalled Marberry bringing himself into games when manager Bucky Harris had only gone to the mound to talk to his starter. There was, after all, no turning back the man named for the boxer who punched Jack Dempsey clear out of the ring.

Marberry was the first pitcher to lead the league in games pitched without starting a single one. He was the first to reach 20 saves in a season. He is still the only person to lead his league in saves more times than Quisenberry. His career saves record stood for 20 years and his season record for 24, both longer than anyone who has held them since.

In the middle of just his second full season, the Washington Evening Star already had declared the “emergency flinger par excellence” the greatest relief pitcher the game had ever seen. And he was, because the game had never seen anything like Marberry. In a lot of ways, he foreshadowed the prototypical closers of future generations. He helped form a lot of the expectations and stereotypes future bullpens were built around. But in the 1920s, when Marberry was pitching, none of those conventions existed yet.

If Marberry was where things started, Hoyt Wilhelm was the next step in the evolution. Wilhelm was baseball’s first Hall of Fame closer and the bullpen’s first true star. Jim Konstanty had won the National League MVP as a reliever in 1950, but as a 33-year-old who came out of nowhere and never repeated that success, that looked as much like a fluke as anything.

Wilhelm debuted in 1952 at nearly 30 years old, a career minor leaguer whose knuckleball, it was assumed, wouldn’t work on major league hitters. When he finally got his chance, he found himself in the bullpen for the same reason most pitchers did (and still do): Teams didn’t think he was good enough to start.

And then something strange happened: Wilhelm was actually good. His rookie year, he won the ERA title without starting a game (he is still the only player ever to do so). Had anyone realized how effective he could be, he probably never would have been in a bullpen. As it was, his stumble into stardom proved the closer could be a genuine asset.

Like Marberry’s theatrics and heavy fastball, Wilhelm’s inauspicious beginnings predicted a lasting staple of the closer archetype. For most relievers, even closers, the bullpen is only ever Plan B. Dennis Eckersley was in his twilight as a starter before rekindling his career as a closer. Mariano Rivera was a starting pitching prospect struggling with fingernail tears and developing an offspeed pitch when the Yankees shifted him to the pen. Hoffman himself was originally a minor league shortstop who couldn’t hit before the Reds (his original organization) thought to try him out on the mound.

Quite a bit changed between Marberry and Wilhelm, and even more between Wilhelm and Hoffman. Marberry didn’t necessarily pitch more than a lot of pre-Hoffman closers (at least ignoring the fact that he also started quite a few games)—he averaged almost exactly two innings per relief appearance over his career—but that varied wildly from game to game. Sometimes he’d come in for just an out or two, sometimes for five innings or more. There were times when he’d pitch the third or fourth inning one day and the ninth the next. His usage was dictated by whether and when the starter faltered and little else. There wasn’t much strategic impulse behind it.

That changed when closers became a regular part of the game. As teams started working on how to best utilize this new tool, patterns emerged that led to a more situational approach. You can clearly see that evolution when you compare how closers’ outings were distributed over the years:

The Firpo-style “ready at any time” ethos is generally considered a hallmark of previous generations of closers, but you can see from the graph that this meant something completely different for Marberry. Pre-Hoffman, it was common to see closers come into a jam in the seventh or eighth inning, but there were still clear patterns in how they were used. For Marberry, there were no patterns.

Starting from that purely anti-situational approach, with Hoffman we ended up at the opposite extreme. Closer use became almost entirely defined by the save situation: Start from a blank slate, get three outs with a one-to-three-run lead.

What is really interesting, though, is what happened in between.

When Wilhelm started, the rules for closers clearly were different. The year he won the ERA title as a reliever, he had to pitch enough innings to qualify. That’s like a reliever pitching 162 innings today.

When you look at his career, though, his usage is not all that different from Quisenberry’s. He had more long outings, and not quite as many one-inning appearances, but the two are far more similar to each other than to either Marberry or Hoffman.

That’s surprising. I think most people would consider Quisenberry to be more firmly a part of Hoffman’s era than Wilhelm’s, but their usage patterns suggest it’s the other way around. What’s more, it’s not particularly close.

This shows up in ways besides just how many innings they pitched in each outing. Beginning around the start of Hoffman’s career, closers overwhelmingly pitched with the bases empty to start an inning. Marberry’s usage diverged from that trend. You might expect to see a gradual shift from one extreme to the other, with each generation of closers more likely to start an inning or come in with the bases empty than the last.

That’s not what we see, though. Wilhelm and Quisenberry both split their appearances roughly 50/50 between starting an inning and coming in with an inning already in progress, and Quisenberry actually came in more often with runners on base. There were some shifts in between, with closers coming in more often with runners on or more often to start an inning, but nothing close to the shift that occurred during Hoffman’s era.

From this perspective, the gradual evolution in bullpen usage looks more like a sudden lurch. There were still some changes in that initial span—Wilhelm, for example, entered before the seventh inning in over a fifth of his relief appearances. In over a third of his outings, he left before finishing the game. He never led the league in saves, and he spent a few years making spot starts or splitting time between the rotation and bullpen once he’d established himself. By Quisenberry’s era, those would all be out of place for an elite closer.

Compared to how Hoffman was used, though, those interim shifts are relatively minor. Hoffman isn’t much of an outlier, either, other than how long and how successfully he stayed a closer. Once his era hit, Hoffman-esque one-inning specialists quickly dominated the role.

The Importance of Leverage

There are three ways you can get more value out of a relief pitcher:

One, improve his actual performance when he is on the mound;

Two, increase his workload;

Three, use him in higher leverage situations.

The trick is how to balance these three factors, which is tougher than it sounds because the three are often at odds.

Pitcher performance generally improves the fewer batters a pitcher has to face or the more rest he gets between outings (at least to a point), but that means decreasing his workload. Saving a pitcher for high-leverage situations means he may not get opportunities to contribute for several days and then have to pitch two or three days in a row. The erratic schedule can affect both his workload and his performance.

Every gain you try to squeeze from one area brings with it a compromise somewhere else, and it can be difficult to anticipate or even measure how these interactions work. That’s why bullpen usage has constantly evolved over the game’s history: Teams struggle to pin down the best way to balance these factors.

Of the three factors, leverage historically has been the most nebulous. Measuring a pitcher’s workload is simple. Measuring performance is a bit more complicated, but we’ve still been doing it in one form or another since the 19th century. Measuring leverage with anything other than gut instincts, on the other hand, is a pretty recent breakthrough.

This is one reason the general trend in bullpen usage has been away from maximizing the number of innings from your closer and toward maximizing the impact of each inning. Teams understood the impact of workload from the start, but it took time and experimentation to gain a greater appreciation for how leverage works.

That started to change when the save statistic was introduced in 1969 (and, more importantly, revised in 1975 to its current definition). For the first time, there was a widely used stat that acknowledged the situational impact of relievers. The criteria—protecting a lead of no more than three runs for at least one inning, or coming in with the potential tying run on deck—were a proxy for leverage.
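To make the proxy concrete, here is a minimal sketch of just the two criteria named above. The function and parameter names are hypothetical, and the official scoring rule carries additional conditions that this sketch leaves out.

```python
def is_save_situation(lead: int, outs_to_record: int, runners_on: int) -> bool:
    """Check the two entry conditions described in the text.

    lead           -- pitcher's team's lead when he enters
    outs_to_record -- outs he would need to get to finish the game
    runners_on     -- baserunners already on when he enters (0 to 3)
    """
    if lead < 1:
        return False  # no lead to protect

    # Criterion 1: a lead of no more than three runs, held for at least one inning.
    small_lead_full_inning = lead <= 3 and outs_to_record >= 3

    # Criterion 2: the potential tying run is on base, at bat, or on deck.
    # With r runners on, the batter represents run r + 1 and the on-deck
    # hitter represents run r + 2.
    tying_run_on_deck = lead <= runners_on + 2

    return small_lead_full_inning or tying_run_on_deck


# Clean ninth inning, three-run lead: qualifies.
print(is_save_situation(lead=3, outs_to_record=3, runners_on=0))
# Clean ninth inning, four-run lead: does not qualify.
print(is_save_situation(lead=4, outs_to_record=3, runners_on=0))
# Two outs, two on, four-run lead: qualifies because the tying run is on deck.
print(is_save_situation(lead=4, outs_to_record=1, runners_on=2))
```

Note that nothing in these conditions asks how likely the lead is to be lost, only whether it fits the template.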

The basic idea behind leverage is that your closer (like any pitcher) has a limited workload. You can’t just use him every day. To get the most out of those limited innings, you want him, as much as possible, to come into games where he can influence the final result. You don’t want to waste him on games you aren’t likely to win anyway or on games you shouldn’t have trouble holding onto with lesser pitchers.

In other words, you can live with fewer innings from your closer overall as long as it means those innings are focused in games where they are more likely to affect the outcome.

Theoretically, the rapid specialization that occurred with Hoffman’s era should preserve this ideal: The drop in workload when you shift to predominantly three-out save situations should be compensated by a corresponding increase in leverage. Of course, in reality we can’t just cleanly divide a season up into games where the closer is useless and games where he isn’t, so it can be difficult to tell if that tradeoff is worth it.

Leverage Index allows us to quantify these situations. Leverage Index is a measure of how much, on average, you can expect the next play to swing your team’s chances of winning the game. If the game reaches a point where your team’s win probability is prone to large swings, that means you’ve identified a game where your closer has a good chance to influence the outcome.
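A toy version of that idea, assuming we already know the possible outcomes of the next play and how much each would move win probability. Every number below is invented for illustration, including the baseline swing, and the published Leverage Index is built from full win-expectancy tables rather than a hand-entered list like this.

```python
import math


def expected_swing(outcomes: list[tuple[float, float]]) -> float:
    """Average absolute change in win probability across possible next plays.

    Each outcome is (probability of the play, change in win probability).
    """
    total_prob = sum(prob for prob, _ in outcomes)
    if not math.isclose(total_prob, 1.0):
        raise ValueError("outcome probabilities must sum to 1")
    return sum(prob * abs(delta) for prob, delta in outcomes)


def leverage_index(outcomes: list[tuple[float, float]], baseline_swing: float) -> float:
    """Expected swing relative to the average situation (1.0 = average)."""
    return expected_swing(outcomes) / baseline_swing


# Hypothetical baseline: the average swing across all game situations.
BASELINE_SWING = 0.035

# Hypothetical ninth-inning spot with a narrow lead: most plays help a little,
# a few hurt a lot.
ninth_inning = [(0.70, +0.04), (0.20, -0.10), (0.10, -0.30)]

# Hypothetical blowout: nothing that happens next moves the needle much.
blowout = [(0.70, +0.002), (0.30, -0.005)]

print(leverage_index(ninth_inning, BASELINE_SWING))
print(leverage_index(blowout, BASELINE_SWING))
```

A value near 2 corresponds to the "close to double" swings described for modern closers, while the blowout lands well below 1.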

Modern closers usually enter the game in situations where those swings in win probability are close to double what they normally are. That is pretty good, but it is not impossible to replicate that with older usage patterns. Bruce Sutter and Rollie Fingers, for example, entered games in higher leverage situations, on average, than Hoffman while still pitching 100-plus innings a year.

The main issue is that saves were conceived as a descriptive stat, as a way to take what closers were already doing and count it up in a tidy way that more visibly credits their performance. The criteria were just a simple set of rules that set a minimum standard for what qualified; they were never meant to be prescriptive. As a proxy for leverage, they tend to break down near the fringes.

When you allow those standards to dictate usage, it distorts the value they were originally intended to capture. You allow your closer to pile up more saves than ever, but do so by discarding most value that doesn’t contribute directly to his save total.

The abrupt shift we saw with Hoffman’s generation didn’t necessarily create any more value from closers. It just created more visible value.

The Hoffman Generation

Hoffman’s generation featured three standout relievers: Hoffman himself, Rivera, and Billy Wagner. The way the three are viewed illustrates the extent to which saves came to define their role.

Let’s start with Hoffman and Wagner:

By runs allowed, Wagner actually compares pretty favorably to Hoffman. Over their five best seasons, Wagner was generally allowing fewer runs per nine while pitching more innings. Wagner also had more seasons at an elite level where Hoffman starts tailing off toward average. Hoffman pitched longer, but those extra innings didn’t necessarily add much on top of what Wagner accomplished.

And unlike Quisenberry, Wagner has no issues with his fielding-independent numbers. For their careers, Wagner has more strikeouts, fewer walks, and fewer home runs allowed than Hoffman. (The lower walk total reflects Wagner’s smaller innings total, since Hoffman has the lower walk rate. By the same token, Wagner’s higher strikeout total means his strikeout rate is much higher, by enough to overcome that same innings gap.) The gap in FIP is smaller than going by runs allowed or ERA, but Wagner’s career FIP- (63) is still a full ten points better than Hoffman’s (73).

Here is what Hoffman looks like compared to Rivera (excluding Rivera’s rookie season when he was mostly a starter):

Rivera is simply in a class of his own. If you threw out the better half of his career, you’d basically be left with Hoffman’s peak, and this isn’t even considering postseason performance. That doesn’t take anything away from Hoffman—there isn’t a reliever in baseball history who wouldn’t pale next to Mo—but it brings up an interesting question. Hoffman’s performance puts him much closer to Quisenberry or Wagner than to Rivera, but Hall of Fame voters, and a lot of fans, seem to see it the other way around. Why is that?

Relievers are notoriously controversial when it comes to the Hall of Fame, and their overall value is never going to stack up well against the game’s best starters. You can make a case that Quisenberry performed comparably to Hoffman or Fingers or Sutter, but at the same time it’s hard to argue he was worth more to the Royals than Bret Saberhagen.

Likewise, there are probably voters who believe Hoffman contributed less to his teams than, say, Kevin Brown or Mike Mussina to theirs, but still want Hoffman in the Hall because he represents the very best of an important role in the game. And once you have Hoffman representing the best of his era, you don’t really need to grant that kind of leeway for Wagner. Once you have Sutter, you don’t need Quisenberry.

So then the question becomes, why Hoffman if we have Rivera? And this is where I think the answer becomes clear. It’s saves. Hoffman was the first to 500. He was the first to 600. To this day, only Hoffman and Mo have hit either of those numbers. And so you start to think of those two as 1A and 1B, like the Kasparov and Karpov of pitching one inning at a time.

And why saves? Because that’s what came to define closers of their era. Ever since the stat was first introduced in the 1960s, the game had been slowly molding its bullpens around its terms. What began as a way to make sense of the contributions of the closer became its defining dogma. When that transition finished, when the save had fully crystallized as the law of the bullpen, Hoffman was the first great closer to show up fully formed in that era of three-out protectionists.

Trevor Hoffman is in the Hall of Fame because, more than anything else, he embodies the full maturation of the modern closer.

…

The bullpen hasn’t stopped evolving. Hoffman’s generation completed the transition toward the save, but our improved understanding of leverage has shown where that approach is too rigid. Three-run leads, for example, aren’t all that vulnerable even with a below-average reliever on the mound. Preventing the opponent from taking the lead in a tie game is of critical importance. In some cases, the base-out state may be more important than the inning.

We’ve started to see evidence that the evolution back away from the save is coming. In the postseason, where the stakes are higher and individual stats less important, teams don’t manage their bullpens the same way. Even in the regular season, Andrew Miller has been one of the game’s best relievers for the past six years; he has 52 career saves.

As that evolution continues, Hoffman’s save total may well become more secure, but at the same time mean less. I don’t know how we’ll view closers then. Maybe we’ll look back and see Wagner and Quisenberry and Smith as Hall of Famers, or maybe we’ll see Hoffman and Sutter and Fingers as peculiarities of their era. I don’t know.

But the mark will still be there. Next to Phil Niekro’s ballet with butterfly wings and “Smokey” Joe Williams’ freight train fastball, next to Gaylord Perry’s layer of vaseline and Greg Maddux’s blindfolded bullpen session, the mark that shows how Trevor Hoffman once defined a generation.
