It was a big awards controversy, at least until the actual votes were revealed. The 2016 American League Rookie of the Year race was coming down to two players: Michael Fulmer and Gary Sánchez. Fulmer, of the Detroit Tigers, pitched for five months and 159 innings, posting a 3.06/3.76/3.95 (ERA/FIP/xFIP) slash line that was good if not spectacular. Sánchez, catching for the New York Yankees, played just the last two months (plus one game in May), but his .299/.376/.657 slash line and strong defense were almost spectacular enough to drag his team back into a playoff chase they’d bailed on at the trade deadline.

The question racked pundits’ brains. When choosing a Rookie of the Year, should a full season of steady goodness win out over two months of meteoric excellence? Partisans of the latter cited how Willie McCovey copped the award with a short rookie campaign almost exactly as long as Sánchez’s. Skeptics, like FanGraphs’s David Laurila, said McCovey lucked out from different qualifying criteria and a weak field. Analysts were preparing to get almost as angry about this as about Mookie Betts’s looming MVP victory over Mike Trout.

As we have seen, funny things can happen when the ballots get counted. Fulmer took 26 of the 30 first-place votes, burying the incipient controversy with his landslide. The Trout affair turned out all right, too. (There was another contentious election around this time, but I ignored it as much as humanly possible.)

Did the Rookie of the Year voters make the right call? And what defines the right call? Should it be the best player that season, or the candidate who will have the best career? Do voters lean against pitchers because their career prospects are more dicey? And should they?

To answer these questions, and others that popped up as I was crunching the numbers, I looked over the history of Rookie of the Year voting.

The History

There have been 140 Rookies of the Year. The award began in 1947, with one winner voted for both leagues. (Jackie Robinson won in 1947, Alvin Dark in 1948.) Two years later, the award was divided to provide one winner in each league, and it’s been so ever since.

Starting in 1948, the ballot was a straight vote for the best rookie: no second-place or lower slots existed. This made ties a plausible outcome, and it ended up happening twice in four years. The 1976 NL race ended with a tie between pitchers Butch Metzger and Pat Zachry, while the 1979 AL vote ended with John Castino and Alfredo Griffin tied. This brought a ballot change, with three slots provided on a 5-3-1 point scale. There have been no split awards since.

I recorded the top three finishers, plus ties, for all 138 Rookie of the Year races—at least those races that had three finishers. Back when there was just one ballot slot, there were numerous times when fewer than three players received votes. Voting was unanimous in four instances*. Other times, ties for third could cause my list to stretch to four or even five players. The 5-3-1 ballot begun in 1980 has made such logjams far less likely.

* Frank Robinson, 1956 NL; Orlando Cepeda, 1958 NL; Willie McCovey, 1959 NL; Carlton Fisk, 1972 AL.

For the top-three candidates, I collected WAR figures both for their rookie years and for their careers. I usually use just Baseball-Reference’s version of WAR, for convenience’s sake, but this time I included FanGraphs’s version as well. I am glad I did, because bWAR and fWAR produce divergent results, which I’ll be noting throughout. (Besides, the notion of WAR, any version, being a definitive measure of a player’s worth is perhaps too deeply entrenched right now.)

I stopped collecting career WAR with the 2000 Rookies of the Year, because at that point we start getting into careers that are still ongoing, making career value a more speculative matter. There is one pre-2000 RoY high finisher still playing (Carlos Beltrán), but I left him in because it’s quite clear no competitor from the 1999 AL vote is going to catch his career WAR mark. (That is also true for the 2001 RoY winners, Albert Pujols and Ichiro Suzuki, but I drew my line and I am sticking with it.)

The Breakdowns

If you want to be Rookie of the Year, you generally want to be a position player. Of the 140 awards given, 102 have gone to position players and 38 to pitchers. The ratio, though, has not been constant through the history of the RoY award.

R-O-Y AWARD WINNER CATEGORIES BY TIME PERIODS

Player Type        1947-’55   ’56-’66   ’67-’80   ’81-2000   ’01-’16
Position Player       11         19        19        32         21
Pitcher                5          3         9        10         11

The first time period listed above is before the Cy Young Award was introduced; the second is during the years that only one Cy Young was given out. The third runs until the RoY ballot was changed, which may have affected the votes in subtle ways. The final break is just for the nice round number.

Before the Cy Young, pitchers won about one in three RoY Awards. In the single-CY era, that ratio plunged, only to rebound when the second CY was added. (Perhaps the CY expansion to match the MVP Award gave voters a sense of normalization, that pitchers were “just like” other players after all.) The ratio slipped for a time after the ballot expanded, but the 21st century has seen it rise back to its peak level.
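To put rough numbers on those shifts, here is a minimal Python sketch that converts the table's counts into the pitchers' share of awards in each period. The counts and period labels are copied from the table as printed; the names `ERA_COUNTS` and `pitcher_share` are only illustrative.

```python
# Winner counts by period, as printed in the table above:
# (position players, pitchers).
ERA_COUNTS = {
    "1947-'55": (11, 5),
    "'56-'66": (19, 3),
    "'67-'80": (19, 9),
    "'81-2000": (32, 10),
    "'01-'16": (21, 11),
}


def pitcher_share(era: str) -> float:
    """Percentage of a period's RoY Awards that went to pitchers."""
    position_players, pitchers = ERA_COUNTS[era]
    return 100 * pitchers / (position_players + pitchers)


for era in ERA_COUNTS:
    print(f"{era:>9}: {pitcher_share(era):.1f}% of awards to pitchers")
```

By these counts the pitchers' share runs from about 31 percent before the Cy Young, down to about 14 percent in the single-CY years, back up to about 32 percent, then about 24 percent, and about 34 percent since 2001.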

Relatively speaking, this is a good time to be a rookie pitcher if you like trophies. The last eight years have been particularly great: Seven of the 16 RoY Awards from 2009 on have gone to hurlers. The span’s a bit cherry-picked but still encouraging for pitchers.

How many of the voters’ choices also have been the metrics’ choices? It depends on whether bWAR or fWAR is your favored metric, because the two have different attitudes toward pitchers and position players. In this table, I show how many award winners had the highest WAR scores for their rookie seasons among the top three in the voting that year.

R-O-Y WINNERS WITH TOP ROOKIE WAR SCORES

Metric   Pos. Players   Pitchers   Total
bWAR         46            26        72
fWAR         56.5          18.5      75

Co-RoY winners count as full; co-WAR leaders count as half.

The overall totals end up close, a little over half the RoY winners having the WAR lead by both measures. The divergence comes with positional breakdown. bWAR is much more likely to credit pitchers with production figures that justify their awards. Over two-thirds of pitcher RoYs, 26 of 38, lead their top threes in bWAR, while fewer than half of position players do, at 46 of 102. fWAR is tougher on the RoY pitchers, giving just under half (18.5 of 38) the WAR lead while 56.5 of 102 position players get it.

The fWAR percentages (48.7 and 55.4) are much closer together than those for bWAR (68.4 and 45.1). Does this mean fWAR is “fairer” to both groups? It’s really impossible to say without making a judgment on whether bWAR or fWAR represents true player values better—and why should my say-so decide an argument that’s gone unresolved all these years? What I can say is that those percentages reflect the two systems’ general valuation of RoY candidates.
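For anyone who wants to retrace those percentages, the arithmetic can be sketched in a few lines of Python. The counts are the ones quoted above; the dictionary and function names are hypothetical, chosen just for this illustration.

```python
# RoY winners by type, and how many of them also led their top three in
# rookie-year WAR. Co-WAR leaders count as half, so counts can be fractional.
WINNERS = {"position": 102, "pitcher": 38}
WAR_LEADING_WINNERS = {
    "bWAR": {"position": 46.0, "pitcher": 26.0},
    "fWAR": {"position": 56.5, "pitcher": 18.5},
}


def leader_share(metric: str, player_type: str) -> float:
    """Percentage of RoY winners of this type who led their top three in WAR."""
    return 100 * WAR_LEADING_WINNERS[metric][player_type] / WINNERS[player_type]


def type_gap(metric: str) -> float:
    """Pitcher share minus position-player share, in percentage points."""
    return leader_share(metric, "pitcher") - leader_share(metric, "position")


for metric in WAR_LEADING_WINNERS:
    print(
        f"{metric}: pitchers {leader_share(metric, 'pitcher'):.1f}%, "
        f"position players {leader_share(metric, 'position'):.1f}%, "
        f"gap {type_gap(metric):+.1f} points"
    )
```

The gap comes out to roughly 23 points in the pitchers' favor under bWAR and roughly 7 points against them under fWAR, which is the divergence described above in a single number per system.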

AVERAGE WAR FOR R-O-Y CANDIDATES

Type (# Candidates)      bWAR   fWAR
All (404)                3.14   2.83
Position Players (272)   2.92   2.82
Pitchers (132)           3.60   2.84

bWAR values pitcher RoY candidates two-thirds of a point higher than position players, while fWAR has them virtually even.
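As a quick consistency check, the all-candidates average should equal the candidate-weighted mean of the two group averages. The sketch below assumes only the published (already rounded) figures from the table, so agreement to two decimals is the most it can show; the names are illustrative.

```python
# Group sizes and average rookie-year WAR, as published (rounded to 2 places).
GROUPS = {
    "position": {"n": 272, "bWAR": 2.92, "fWAR": 2.82},
    "pitcher": {"n": 132, "bWAR": 3.60, "fWAR": 2.84},
}


def pooled_average(metric: str) -> float:
    """Candidate-weighted mean of the two group averages."""
    total_n = sum(group["n"] for group in GROUPS.values())
    weighted_sum = sum(group["n"] * group[metric] for group in GROUPS.values())
    return weighted_sum / total_n


def pitcher_premium(metric: str) -> float:
    """Average pitcher WAR minus average position-player WAR."""
    return GROUPS["pitcher"][metric] - GROUPS["position"][metric]


for metric in ("bWAR", "fWAR"):
    print(
        f"{metric}: pooled {pooled_average(metric):.2f}, "
        f"pitcher premium {pitcher_premium(metric):+.2f}"
    )
```

The pooled values land on the table's 3.14 and 2.83, and the pitcher premium is 0.68 WAR under bWAR against 0.02 under fWAR.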

The voters themselves seem to have some bias against the moundsmen. From 1947 on, 132 pitchers have filled the 404 spots for top-three (plus ties) finishers, or 32.7 percent of the candidates. Pitchers have made up just 38 of the 140 winners, or 27.1 percent. This tilt has eased in recent decades, even as pitchers have accumulated less individual value due to the spreading of workloads. From 1996 on (dating after the 1994-95 strike), 35.7 percent of RoY candidates were pitchers, against 33.3 percent of the winners.

Next, I will take the opposite view. Instead of seeing how often the Rookie of the Year led the candidates in WAR, I’ll look at how the WAR leaders (among the top three, of course) finished in the balloting. Co-WAR leaders are counted as one-half, while ties in the RoY voting are resolved to the higher spot. (E.g., a three-way deadlock for third counts as a third place for all three.)

ROOKIE WAR LEADERS PLACING IN R-O-Y BALLOTING

Metric/Player Type   First   Second   Third
bWAR/Pos. Player     46      26.5      7
bWAR/Pitcher         26      19       13.5
bWAR/All             72      45.5     20.5
fWAR/Pos. Player     56.5    29       11
fWAR/Pitcher         18.5    13       10
fWAR/All             75      42       21
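The counting rule behind this table can be sketched as a small Python function. This is an illustration of the rule as described, not the actual tallying procedure used for the article: the tuple layout and function name are assumptions, and splitting a credit equally among three or more co-leaders is an extrapolation from the stated half-credit rule for two.

```python
from collections import defaultdict

# One finisher: (name, player_type, place, rookie_war). Tied finishers are
# assumed to arrive already sharing the higher place, per the rule above.


def credit_war_leaders(race):
    """Split one credit among a race's rookie-WAR leader(s), keyed by (type, place)."""
    best = max(war for _, _, _, war in race)
    leaders = [row for row in race if row[3] == best]
    credits = defaultdict(float)
    for _, player_type, place, _ in leaders:
        # Two co-leaders get half each; equal splits for 3+ are an assumption.
        credits[(player_type, place)] += 1 / len(leaders)
    return dict(credits)


# 2016 AL figures quoted later in this article.
AL_2016_BWAR = [
    ("Fulmer", "pitcher", 1, 4.8),
    ("Sánchez", "position", 2, 3.0),
    ("Naquin", "position", 3, 0.8),
]
AL_2016_FWAR = [
    ("Fulmer", "pitcher", 1, 3.0),
    ("Sánchez", "position", 2, 3.2),
    ("Naquin", "position", 3, 2.5),
]

# Hypothetical race, invented only to show the half-credit rule.
MADE_UP_TIE = [
    ("Player A", "pitcher", 1, 2.0),
    ("Player B", "position", 2, 2.0),
    ("Player C", "position", 3, 1.1),
]

for label, race in [("bWAR", AL_2016_BWAR), ("fWAR", AL_2016_FWAR), ("tie", MADE_UP_TIE)]:
    print(label, credit_war_leaders(race))
```

Summing these per-race credits over every race would yield one row of the table per metric and player type.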

There’s a clear pattern across both WAR methods. A position-playing WAR leader is a very good shot to win Rookie of the Year and quite unlikely to be dumped down to third. Pitchers, on the other hand, have worse odds to take the trophy and bigger chances to end up third with their leading WAR numbers. The voters seem not to fully respect the best WAR if it comes from the hill.
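One way to see that pattern is to convert each row of the table into the share of WAR leaders finishing first, second and third. The following is a minimal sketch using only the table's counts; `PLACEMENTS` and `placement_shares` are hypothetical names.

```python
# WAR leaders by final placement (first, second, third), from the table above.
PLACEMENTS = {
    ("bWAR", "position"): (46.0, 26.5, 7.0),
    ("bWAR", "pitcher"): (26.0, 19.0, 13.5),
    ("fWAR", "position"): (56.5, 29.0, 11.0),
    ("fWAR", "pitcher"): (18.5, 13.0, 10.0),
}


def placement_shares(metric: str, player_type: str) -> tuple:
    """Percent of this group's WAR leaders who finished first, second, third."""
    counts = PLACEMENTS[(metric, player_type)]
    total = sum(counts)
    return tuple(round(100 * count / total, 1) for count in counts)


for (metric, player_type), _ in PLACEMENTS.items():
    first, second, third = placement_shares(metric, player_type)
    print(f"{metric} {player_type:>8}: 1st {first}%  2nd {second}%  3rd {third}%")
```

Under either system, about 58 percent of position-player WAR leaders won the award against about 44 percent of pitcher WAR leaders, while roughly 9 to 11 percent of the position players fell to third against 23 to 24 percent of the pitchers.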

It’s arguable that this is the correct attitude for them to take. Pitchers have a higher washout rate than position players, due to their greater vulnerability to career-wrecking injury. If a Rookie of the Year vote is meant, not just as acknowledgment for this season, but as a vote of confidence in the career to follow, rookie pitchers would trade at a discount in the voting.

Whether this is the way RoY voters should be thinking is, of course, another matter. We are familiar with the debate over whether MVP candidates should be judged solely on individual performance or have their team’s success taken into account (on the grounds that an improved, but still losing, record is not all that valuable).

There is strong evidence that voters didn’t do this in the early years of the award. The very first winner, Jackie Robinson, was 28 in the 1947 season, his future career already significantly circumscribed by his long wait to reach the bigs. Other ex-Negro Leaguers who took the award in the following five years included 33-year-old Sam Jethroe and 28-year-old Joe Black.

Some of that attitude survived at the turn of the millennium, creating a bit of controversy. There were those who argued that Japanese transplants to the majors, Hideo Nomo being the first major one, should not get full consideration for the Rookie of the Year Award. They already had played several years in their home country’s top league, the argument went, making an apples-to-oranges comparison to youngsters just coming up from the North American minors quite difficult. (A like argument could have been made for those rising up from the Negro Leagues. If it was, it didn’t stick.)

The voters dismissed this reasoning. Nomo was the 1995 NL Rookie of the Year, and was followed by countrymen Kazuhiro Sasaki (2000 AL) and Ichiro Suzuki (2001 AL). Voters perhaps might want their vote back for Sasaki, 32 years of age in 2000, as he had just three more seasons to come in the majors. (However, the second- and third-place finishers, Terrence Long and Mark Quinn, also had modest careers, so there wasn’t an obviously better candidate.) By now, I do not think anybody begrudges Suzuki his RoY Award.

This has been a long, tangential way of introducing how well RoY results predict career success. First, let’s see how often the Rookie of the Year winner has the best career of the contenders. I count full career WAR, including for the rookie seasons. One also could do it by subtracting the WAR accumulated in the rookie year (and in any previous service time), to get just future value, but that’s a lot of complication for a slight benefit. Numbers in parentheses are RoY winners.

R-O-Y WINNERS WITH BEST CAREER WAR

Metric   All (108)   Hitters (81)   Pitchers (27)
bWAR        56           46              10
fWAR        52           41              11

There were 308 players who finished third or better in RoY voting (what I’ve been calling “candidates” or “contenders”) out of 106 races, producing 108 awards, through my cutoff date of 2000. Thus, 34.4 percent of contenders would end up the best career performer in their year and league, or 35.1 percent if we figure in the two ties. The actual rates are 51.9 percent by bWAR and 48.1 percent by fWAR. That’s a distinct improvement from random but still no better than a coin flip.
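The random baseline and the actual rates above come from simple division, sketched here in Python. All figures are the ones quoted in the text; the variable and function names are just for illustration.

```python
# Figures through the 2000 cutoff, as quoted above.
RACES = 106
AWARDS = 108        # two races ended in ties
CONTENDERS = 308    # top-three finishers, plus ties
BEST_CAREER_WINNERS = {"bWAR": 56, "fWAR": 52}


def baseline_pct(slots: int) -> float:
    """Chance that a randomly chosen contender holds one of `slots` best-career spots."""
    return 100 * slots / CONTENDERS


def hit_rate(metric: str) -> float:
    """Percentage of RoY Awards that went to the eventual career-WAR leader."""
    return 100 * BEST_CAREER_WINNERS[metric] / AWARDS


print(f"random baseline: {baseline_pct(RACES):.1f}% ({baseline_pct(AWARDS):.1f}% with ties)")
for metric in BEST_CAREER_WINNERS:
    lift = hit_rate(metric) / baseline_pct(AWARDS)
    print(f"{metric}: {hit_rate(metric):.1f}% actual, {lift:.2f}x the baseline")
```

Put as a ratio, the voters picked the eventual career leader a bit under one and a half times as often as chance would, by either measure.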

(Do recall that this does not count players who don’t finish high in the voting. For example, in the 1951 American League, Gil McDougald and Minnie Miñoso were the only vote-getters for Rookie of the Year. Shut out was a player who struggled badly enough early that he was temporarily demoted back to the minors. That player was Mickey Mantle, whose career WAR outpaced McDougald’s and Miñoso’s put together. There are plenty more outstanding players who, due to bad timing or just a rough first year, never placed high in RoY voting.)

The news is worse for rookie pitchers. Just 37 percent of their RoY winners had the best careers by bWAR, and 40.7 percent by fWAR, which is not that far from random. For the position players, it’s 56.8 and 50.6 percent, respectively. Ironically, fWAR, previously more hostile to pitchers than bWAR, is more forgiving in this instance.

Do we get better predictions if we use rookie-year WAR figures instead of the trophies?

ROOKIE WAR LEADERS WITH BEST CAREER WAR

Metric   All (108)   Hitters (81)   Pitchers (27)
bWAR       47.5         29.5            18
fWAR       57.5         41.5            16

Co-WAR leaders count as half.

Results are mixed. The yearly bWAR figures are poorer predictors than the voters, but the fWAR numbers are better. Taken all together, they are a little worse at predicting future achievement than the voters.

They do substantially better with pitchers, reversing the results for award winners. Two-thirds of pitchers by bWAR, and 59.3 percent by fWAR, converted rookie-year WAR leads into career leads. That stands against 36.4 and 51.2 percent, respectively, for position players. The bWAR position players are now effectively no better than random, though the fWAR figures held about even with the award winners’ results.
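To line the two predictors up side by side, here is a short Python sketch comparing the award and the rookie-year WAR lead as signals of the best career. It reuses the counts from the two career tables and follows the article's convention of dividing both by the group sizes in the table headers; the names `hit_rate` and `better_signal` are hypothetical.

```python
# Group sizes as given in the two table headers (through the 2000 cutoff).
GROUP_SIZES = {"all": 108, "hitters": 81, "pitchers": 27}

# How often each signal pointed at the contender with the best career WAR.
BEST_CAREER_HITS = {
    "award": {
        "bWAR": {"all": 56, "hitters": 46, "pitchers": 10},
        "fWAR": {"all": 52, "hitters": 41, "pitchers": 11},
    },
    "war_lead": {
        "bWAR": {"all": 47.5, "hitters": 29.5, "pitchers": 18},
        "fWAR": {"all": 57.5, "hitters": 41.5, "pitchers": 16},
    },
}


def hit_rate(signal: str, metric: str, group: str) -> float:
    """Percentage of the group for which the signal matched the career leader."""
    return 100 * BEST_CAREER_HITS[signal][metric][group] / GROUP_SIZES[group]


def better_signal(metric: str, group: str) -> str:
    """Name the signal with the higher hit rate, or 'tie'."""
    award = hit_rate("award", metric, group)
    war_lead = hit_rate("war_lead", metric, group)
    if award == war_lead:
        return "tie"
    return "award" if award > war_lead else "war_lead"


for metric in ("bWAR", "fWAR"):
    for group in GROUP_SIZES:
        print(
            f"{metric} {group:>8}: award {hit_rate('award', metric, group):.1f}%, "
            f"WAR lead {hit_rate('war_lead', metric, group):.1f}% "
            f"-> {better_signal(metric, group)}"
        )
```

For pitchers the WAR lead comes out ahead under both systems, while for hitters the award wins comfortably under bWAR and the two are nearly level under fWAR.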

If award voters are discounting rookie pitchers in an attempt to pick the best future performers, they’ve made a big mistake. Pitchers whose WAR outperforms their award competitors have a very good chance of producing the best careers. It’s a better predictive method than any other breakdown, for pitchers or position players.

The Original Question

I wanted to answer general questions with this data dive, but I hoped to answer a specific question as well. Did the voters make the right call in making Michael Fulmer 2016’s AL Rookie of the Year over Gary Sánchez? Was he the right choice for his 2016 performance, and for the career performance we can anticipate he will have?

Let’s look at the numbers. These are the 2016 WAR figures for Fulmer, Sánchez, and Cleveland’s Tyler Naquin, who finished third in the AL RoY race.

2016 AL ROOKIE OF THE YEAR TOP THREE, BY WAR

Metric   Michael Fulmer   Gary Sánchez   Tyler Naquin
bWAR          4.8              3.0            0.8
fWAR          3.0              3.2            2.5

I included Naquin for completeness and to show that his WAR numbers left him out of the discussion. We see the disagreement between the two systems again. bWAR makes Naquin an afterthought, while fWAR considers him at least within shouting distance of the two leaders.

The historical bias of bWAR toward pitchers, or the fWAR bias against pitchers, shows up in spades. The margin between Fulmer and Sánchez differs by a full two WAR depending on the source. This leaves bWAR strongly agreeing with the RoY voters’ verdict, while fWAR almost apologetically disagrees.
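The two-WAR swing is just the difference between the two systems' Fulmer-over-Sánchez margins. A minimal Python sketch, using the table's figures and illustrative names, makes the subtraction explicit.

```python
# 2016 rookie-year WAR for the AL top three, from the table above.
WAR_2016 = {
    "bWAR": {"Fulmer": 4.8, "Sánchez": 3.0, "Naquin": 0.8},
    "fWAR": {"Fulmer": 3.0, "Sánchez": 3.2, "Naquin": 2.5},
}


def margin(metric: str, a: str = "Fulmer", b: str = "Sánchez") -> float:
    """WAR of player `a` minus WAR of player `b`, rounded to one decimal."""
    return round(WAR_2016[metric][a] - WAR_2016[metric][b], 1)


def leader(metric: str) -> str:
    """Contender with the highest rookie-year WAR under this system."""
    return max(WAR_2016[metric], key=WAR_2016[metric].get)


# How far apart the two systems are on the Fulmer-Sánchez question.
swing = round(margin("bWAR") - margin("fWAR"), 1)

for metric in WAR_2016:
    print(f"{metric}: leader {leader(metric)}, Fulmer margin {margin(metric):+.1f}")
print(f"swing between systems: {swing:.1f} WAR")
```

Fulmer leads by 1.8 under bWAR and trails by 0.2 under fWAR, so the gap between the systems is 2.0 WAR.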

Which one has the better career prospects by the methods we’ve seen here? (We are omitting influences like age and the attrition working against catchers like Sánchez. You know, the things we would normally consider.)

By having won the award as a pitcher, Fulmer has roughly a 40 percent chance to have the best career of the three, which is not a big gain from random chance. By leading the contenders in bWAR, Fulmer historically has a two-thirds chance to lead in career bWAR. Meanwhile, by edging ahead in fWAR in 2016, Sánchez is considered about a 50-50 shot to lead in career fWAR.

Perhaps predictably, no very clear answer emerges. Winning the award got Fulmer little prognosticating advantage, but the split decision in WAR measures leaned his way. I’d consider Fulmer the favorite by these measures, though perhaps not to the level of even odds or better. It could well end up another split decision, Fulmer having the best career bWAR, while Sánchez leads in fWAR. Or it might be the reverse, or one of many other permutations among the three candidates.

And nothing assures that “best” will equate to “great” or even “good.” In six instances from 1947 to 2000, no top-three contender in a RoY race has managed to crack 10 WAR for his career. (The number is the same for fWAR and bWAR, though some races are different.) Given the 106 races that covers, it’s a pretty low failure rate, but it’s a good caution to have.

Conclusion

I haven’t proven anything much about one single race, which is no big surprise. I did find interesting patterns in the overall data, however. A big one was that the two leading WAR measures, bWAR and fWAR, judge position players and pitchers quite differently. This presumably goes well beyond seasonal and career totals for Rookie of the Year contenders, something to consider any time one is comparing the two groups of players.

I found that pitchers receive less Rookie of the Year consideration than their WAR production would seem to merit. If voters are doing this to adjust for the poorer career prospects pitchers are supposed to have, it is a mistake. RoY-contending pitchers who lead their year in WAR tend strongly to have the best careers among the contenders as well. Measured against rookie-year WAR numbers, though, the voters are marginally better at predicting the best future players with the award they give.

Michael Fulmer won his Rookie of the Year Award in an era when pitchers are receiving historically improved consideration for the award, and the WAR method that likes pitchers more deemed him worthy of it. As for whether he justifies his hardware with a great career, or one of his competitors makes future fans wonder how the voters could have goofed up that badly…well, that’s what the next 10 or 15 years are for.

References and Resources

FanGraphs and Baseball-Reference for player WAR figures; additional credit to B-R for Rookie of the Year voting results and some player bio information.