Rickey Henderson is the most celebrated base stealer of all time. For 25 years he terrorized opposing pitchers, catchers and middle infielders with his pitch anticipation, aggressiveness, and raw speed. Seemingly every walk or single turned into a double. Often that double turned into a triple, and on a few occasions a triple turned into a home run.

But as the sabermetric revolution gained traction in the minds of fans and analysts, we shone the spotlight less on his raw stolen base total and more on his 80.7 percent success rate. While that rate is excellent, it ranks 11th-highest out of players with 500 or more attempted steals. It leaves us room to consider other players as the best base stealers in major league history.

The man immediately above Henderson on this list (if you round to three decimal places) also garnered attention on the diamond. Yet despite leading his league in steals six years in a row, making multiple All-Star appearances, and having a higher success rate at swiping bags, fans don’t view this guy the same way they do Henderson.

This makes sense. The man I’m referring to was less talented with the bat, and his career was shorter. But even if we focus just on stealing bases, this guy’s name never comes up.

I’m not talking about Tim Raines. “Rock” does have a higher stolen base percentage, and he did suffer in Henderson’s shadow. But Raines was vindicated when he was elected to the Hall of Fame. And as we’ll see later, his base stealing prowess may be overrated.

No, I’m talking about a third guy, one who played against both men but in their shadows. Statistical analysis provides evidence that we should celebrate Vince Coleman, not Henderson or Raines, as the most successful base stealer of his generation.

Modeling Success

How can I make this claim? I used empirical Bayesian analysis to estimate the true-talent stolen base percentage for all three men. This technique lets us compare players across eras, with different sample sizes, and while incorporating the principle of regression to the mean. For more details about empirical Bayesian analysis, refer to my article in The Hardball Times Annual 2018 and David Robinson’s excellent book.

Bayesian analysis begins with our prior expectations of each player’s talent. I modeled prior expectations on two factors:

The season

The number of attempted steals in that season

Why the season? Because major league-wide stolen base percentage fluctuates over time:

Why the number of attempts? Because managers and coaches control the running game, and they give more attempts to successful base stealers. If a player attempts 100 steals in a season, we should expect his true-talent stolen base percentage is higher than a player who attempts 20 steals in that same season.

The following graph shows how these factors interact:

The solid line shows the most likely true-talent stolen base percentage of a player who attempts 20 steals in a season. (The dashed lines show the 97.5 percent and 2.5 percent outcomes.) If a player in 1960 attempted 20 steals, we would expect him to have a true-talent success rate of about 64 percent. If a player in 2017 attempts 20 steals, we’d expect him to have a true-talent success rate of about 74 percent.

Why the change? Refer to the graph before this one. Today’s base stealers are more successful because today’s managers understand the value of not making an out much better than their 1960s counterparts did. Today’s managers give steal attempts to talented base stealers, whereas managers in the 1950s let darn near anyone run. Our prior expectation of a player’s true-talent stolen base percentage must reflect this change in thinking.

A comment: Before settling on this model, I tried one that used Retrosheet data to account for stolen base attempts per opportunity instead of just raw attempt totals. I thought accounting for opportunities would normalize players who reached base at different rates. It may have, but that model produced poor results, so I stuck with raw attempt totals, which produced more realistic ones.


Accounting for On-field Success

As players perform on the field, we pay less attention to the prior expectations and more to what they’ve demonstrated in real life. The graph that mixes prior expectations with on-field results is a probability distribution of the player’s true-talent stolen base percentage for that season. The distribution shows the range in which that true talent most likely exists. (We call this distribution the posterior distribution because it comes after observing what the player did.)

The following graph shows the prior and posterior distributions for Henderson’s 1992 season:

Notes:

The prior curve peaks at 76.9 percent. This is what we would expect any player’s true-talent stolen base percentage to be, knowing only the season (1992) and the number of attempted steals (59) in that season.

The posterior curve peaks at 79 percent. This is what we would expect Henderson’s true-talent stolen base percentage to be, given not only the prior expectation of 76.9 percent, but also the observed SB% of 81.3 percent.
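The update described in these notes can be sketched with a Beta prior and a binomial likelihood. This is a minimal illustration, not the model used for this article: the helper names, the prior "strength" of 66 pseudo-attempts, and the 48-of-59 split (chosen to match the 59 attempts and roughly 81 percent observed rate) are all assumptions made so the sketch lands near the figures quoted above.

```python
def beta_from_mode(mode, strength):
    """Return (alpha, beta) for a Beta distribution with the given mode.

    `strength` is alpha + beta - 2, i.e. how many pseudo-attempts the prior
    is worth. The value used below is an assumption, not a fitted parameter.
    """
    alpha = mode * strength + 1.0
    beta = (1.0 - mode) * strength + 1.0
    return alpha, beta


def beta_mode(alpha, beta):
    """Peak of a Beta(alpha, beta) density (valid when alpha, beta > 1)."""
    return (alpha - 1.0) / (alpha + beta - 2.0)


def update(alpha, beta, successes, failures):
    """Conjugate update: add observed steals and caught-stealings."""
    return alpha + successes, beta + failures


# Prior peaking at 76.9 percent; the strength of 66 is a hypothetical choice.
prior = beta_from_mode(0.769, 66.0)

# Assumed 1992 line: 48 steals in 59 attempts.
posterior = update(*prior, successes=48, failures=11)

print(round(beta_mode(*posterior), 3))
```

The posterior peak sits between the prior peak and the observed rate. The stronger the prior relative to the number of attempts, the closer the estimate stays to the prior.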

The drop between actual and estimated stolen base percentage is regression to the mean in action. Think about the variables involved in a steal attempt. You have not only the player’s speed and acceleration, but also his first step, the pitcher’s motion, the speed of the pitch to the plate, the arm strength and accuracy of the catcher, and the ability of the shortstop or second baseman to catch the ball and apply the tag. Judging a player by his on-field success rate ignores these factors.

To find career estimates of stolen base percentage, I repeated the above for each player-season and added up the totals. Here’s what I found.
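One simple reading of "added up the totals" is an attempt-weighted average of the per-season estimates. The sketch below takes that reading as an assumption. The function name and the toy seasons are invented for illustration and are not real player records.

```python
def career_estimate(seasons):
    """Attempt-weighted career rate.

    `seasons` is a list of (attempts, estimated_rate) pairs, one per
    player-season. Each season contributes its expected number of
    successful steals; the career rate is expected steals over attempts.
    """
    total_attempts = sum(attempts for attempts, _ in seasons)
    expected_steals = sum(attempts * rate for attempts, rate in seasons)
    return expected_steals / total_attempts


# Toy data (hypothetical): a high-attempt season counts for more than a
# low-attempt season when the totals are combined.
toy_seasons = [(100, 0.80), (50, 0.76), (20, 0.70)]

print(round(career_estimate(toy_seasons), 3))
```

Under this reading, a handful of low-attempt seasons can only nudge the career figure, while seasons with many attempts dominate it.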

Henderson vs. Coleman

The following graph shows the posterior distributions for both players, along with labels of what each area means:

The peaks of each distribution show the most-likely true-talent SB% for each man:

Coleman: 79.4 percent

Henderson: 78.8 percent

But comparing point estimates doesn’t account for the range of probabilities shown by the distributions. As the labels show, there is some chance Coleman is the better thief and some chance Henderson is. Just looking at the graph makes it difficult to tell.

We can use an A/B test to calculate the probability that Coleman is the better thief. Numerical integration tells us there’s a 70 percent chance Coleman is better. Simulating one million seasons for each player, and calculating the percentage of seasons in which Coleman’s stolen base percentage is higher than Henderson’s, gives the same result.
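The simulation approach can be sketched as follows, assuming each player's posterior is approximated by a Beta distribution. The peaks (79.4 and 78.8 percent) come from the article; the strength values of 900 and 1,700 are hypothetical stand-ins for how concentrated each curve is. The output will therefore not necessarily match the 70 percent figure above.

```python
import random


def beta_from_mode(mode, strength):
    """Return (alpha, beta) for a Beta distribution peaking at `mode`."""
    return mode * strength + 1.0, (1.0 - mode) * strength + 1.0


def prob_a_beats_b(params_a, params_b, draws=100_000, seed=0):
    """Monte Carlo estimate of P(rate_a > rate_b).

    Draws one true-talent rate per player from each Beta distribution and
    counts how often player A's draw is the higher of the two.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        if rng.betavariate(*params_a) > rng.betavariate(*params_b):
            wins += 1
    return wins / draws


# Peaks from the article; strengths are assumptions for illustration only.
coleman = beta_from_mode(0.794, 900.0)
henderson = beta_from_mode(0.788, 1700.0)

p_coleman_better = prob_a_beats_b(coleman, henderson)
print(f"P(Coleman > Henderson) is roughly {p_coleman_better:.2f}")
```

Even a small gap between two peaks can translate into a sizable probability that one player is better, and that probability depends on how wide each curve is.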

The following plot of the players’ joint densities shows this chance:

About 70 percent of the cloud is on Coleman’s side of the plot. That’s the chance he’s a better base stealer than Henderson.

How can this be true? The two men have nearly identical record-book stolen base rates. And I already told you steal attempts factor into the model. Henderson attempted 812 more steals than Coleman, so you’d think Henderson would emerge superior.

The answer lies in two graphs. First, recall the graph near the beginning of this article about the estimated stolen base percentage of a 20-attempt player in any given year. Notice that around 1985, expectations for stolen base percentage start to rise sharply.

Now look at the following graph, which shows the attempts and success rate per season for each player:

The post-1985 rise in expectations harms Henderson. Most of his high-attempt seasons occurred prior to 1985, but he was less successful in those seasons. By the time he began stealing bases more successfully, he was attempting fewer steals than Coleman. Conversely, Coleman racked up his highest attempts and most of his highest success rates after 1985.

Henderson also hung on a bit too long, accruing some low-attempt, low-success seasons at the end of his career. This longevity pushed his raw steals total to record-level heights but harmed his estimated stolen base percentage. Conversely, Coleman burned out instead of fading away. He might have succeeded less often had he stuck around longer, but we don’t have any evidence of this.

What about Raines?

Yeah, what about him? Surely his 84.7 percent stolen base success rate means his estimated rate trumps Coleman’s. Right?

Let’s find out:

Yikes. Raines’ curve peaks at 75.9 percent. That rate is far behind not only his actual rate, but also those of Henderson and Coleman. Why?

My model doesn’t believe in Raines because it sees his managers didn’t give him the green light a lot. Look at Raines’ 1979–1980 seasons and his 1993–2001 seasons (minus the year 2000 when he didn’t play). He was successful at swiping bags during those seasons, but the low number of attempts cancels the success out. These 10 seasons account for almost half his career. True, managers were more conservative in 2001 than they were in 1993, but not so much that Raines looks good.

Perhaps my model is penalizing Raines unfairly. I do find it counterintuitive that he was successful on the base paths during those years but didn’t get the green light a lot. His OBP was pretty high in the late stages of his career, so he should have had a decent number of steal attempts.

Maybe his managers those years were biased or conservative on the base paths to a degree that the model doesn’t account for, or maybe they failed to believe their own eyes. These arguments would make sense to me, and a future study could try to correct for them.

Regardless, the following graph illustrates my point. It shows stolen base percentage overperformance by subtracting estimated stolen base percentage from actual stolen base percentage:

A positive value indicates an actual stolen base percentage higher than what the model expects. By this method, Raines was overrated as a base stealer for much of his career. This fact helps highlight Coleman’s status as underrated, especially since the two played during the same era.

But I come not to bury Raines, but to praise Coleman. Take a bow, Vince. The evidence suggests you were a better base stealer than both of your better-known peers. They may have had the all-around game to get inducted into the Hall of Fame. You, however, can rest easy knowing you out-stole them on the base paths.

References and Resources