It’s PuRP season here at Purple Row, the biannual community prospect rankings vote. In the shadow of the winter meetings, we gather to crowdsource the Top 30 list of Rockies prospects.

As an admitted analytics nut (who happens to be typing this sentence from his mom’s actual basement), I love to incorporate statistical data into everything. While this might not be the preferred method for everyone, I think it’s valuable to at least be fluent with several different approaches to prospect hounding. If you’ll spare me some of your time, I’d like to give you a little primer on using statistics to evaluate minor league baseball players.

Stats and Scouting, Together

The tired debate among baseball talking heads pits the scouts against the statheads in a proverbial battle to the death. Framing the discussion as a conflict doesn’t do justice to reality. In truth, the modern approach is to blend the evaluations of scouts with the wealth of available statistical data. It’s with this premise, that the eye test and the data test work together, that we’ll proceed.

KATOH Minor League Projections

Since 2014, FanGraphs’ Chris Mitchell has published his own work on forecasting minor league players. His model, called KATOH, attempts to estimate the likelihood of a player (i) reaching the major leagues and (ii) achieving several WAR thresholds (1, 4, 7, 10, etc.), using regression analysis on minor league stats. KATOH (indirectly) compares current players to historical minor leaguers, looking for trends that are predictive of major league success. For instance, Mitchell has found that a hitter’s strikeout rate (K%) is strongly correlated with MLB success. Therefore, KATOH assigns a higher rating to minor league hitters with excellent (that is, low) strikeout rates. For more information, and a technical description of exactly how KATOH works, you can check out the following:

KATOH: Forecasting Major League Hitting with Minor League Stats

An Improved KATOH Top-100 List

Chris Mitchell on KATOH and Forecasting Prospects

StatCorner

One of the challenges when “scouting the statline” is that most easily-available minor league data is woefully incomplete. Most websites that provide public access to the data only share the bare minimum of traditional stats, such as home runs, RBIs, ERA, and so forth. Since these measurements capture the result of play and not the process of playing, it can be difficult to (statistically) distinguish between which players are legit prospects and which are getting by with smoke and mirrors.

In contrast, Matthew Carruth’s StatCorner.com provides an array of minor league data. In addition to the “standard” stats, Carruth hosts batted ball data (like groundball/flyball/line drive rates), plate discipline stats (like swinging-strike rate, chase rate, contact rate), and a few other excellent custom metrics. I can’t express how much I love this website: it’s now my go-to resource for keeping track of Rockies farmhands.

Let’s take a little guided tour, shall we?

Homepage

The homepage is pretty simple to follow. While it’s outside the scope here, I’d like to mention that Carruth has his own catcher framing metrics, which you can access via the “Reports” dropdown.

For our purposes, I’d like to show you around the player pages for hitters and pitchers, and give you a sense of what information is available there. I’ll also overview some of the metrics you might not be familiar with.

Player Pages: Cool features

Each player page lets you toggle the display of minor league stats and, for pitchers, switch between stats as a starter (Show Rotation Stats) and as a reliever. In addition, the Park Adjust toggle will apply the relevant park adjustment to the stats that aren’t already park adjusted. For instance, xRA+ is already a park-adjusted stat, so it doesn’t change when toggled. However, stats like strikeout rate (SO%) and unintentional walk rate (nB%) are normally displayed in raw form; when Park Adjust is toggled, you’ll see the estimated equivalent in a neutral park.

The final option, Show Graph, is only for some (veteran) major league pitchers. It displays a graph of pitch selection and outcomes by pitch.

Another neat feature is that, for rate statistics (like AVG, SO%, etc.), StatCorner will display the league-average value when you hover over the stat. This is a super useful feature! Without context, a player’s statistics are meaningless, but seeing them relative to league average gives a good sense of where the player actually stands.

The last feature I’ll mention here is that many stats have a brief glossary entry when you hover over the column headers. Here, Evn is displaying the percentage of pitches seen in even counts.

Player Pages: Hitters

Above is the player page for Raimel Tapia. Let’s take a look at a few of the stats you might not know about, from top-to-bottom, left-to-right:

Pwr+ : A rating for the difference between xHR (‘expected home runs’) and actual HR. More powerful hitters turn more flyballs into home runs than weaker hitters.

wOBA / wOBA* / wOBA+ : wOBA is similar to OPS, in that it tries to measure the total offensive output of a hitter, including hitting for average, power, and drawing walks. See this article from FanGraphs for more information. wOBA* is a park-adjusted version of wOBA, and wOBA+ is very similar to wRC+, where 100 is scaled to be league average, and being higher is better.

rv600 , batV , paV , posV : These stats measure the player’s run-value in different categories: batting (batV), replacement (paV), and positional (posV). rv600 is the same as batV, but scaled to 600 plate appearances (a full season).

Swg , Cont , Ahd , Bhd , Evn : These stats measure the rate at which a batter swings (Swg), the rate at which he makes contact (Cont) and the percentage of pitches seen ahead, behind, or even in the count.

Zone - oCt : Available at the MLB level only, these stats measure the rate of contact and swings, depending on pitch location (in or out of the strike zone).

GB% , FB% , LD% , IF% , pFB% : Rate of groundballs, flyballs, line drives, and popups. pFB% is the percentage of flyballs that are pulled.

RBBIP : Similar to BABIP (batting average on balls-in-play), but including reaching base on errors.

HR/OF : Percentage of outfield flyballs that went for home runs.

kL% , kS% , nB% : Percentage of plate appearances ending in a called strikeout, a swinging strikeout, or a non-intentional walk or hit-by-pitch.
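If the scaling behind a couple of these is fuzzy, here is a minimal Python sketch of the arithmetic as I read the definitions above. The function names and the sample numbers are mine, made up for illustration; StatCorner’s actual calculations may differ in the details.

```python
def per_600(run_value: float, plate_appearances: int) -> float:
    """Scale a cumulative run value (like batV) to a 600-PA season."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return run_value * 600 / plate_appearances


def pct_of_pa(events: int, plate_appearances: int) -> float:
    """Share of plate appearances ending in some event, as a percentage."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return 100 * events / plate_appearances


# Hypothetical hitter: 10 batting runs and 15 called strikeouts in 300 PA.
print(per_600(10.0, 300))   # rv600-style figure
print(pct_of_pa(15, 300))   # kL%-style figure
```

The point of the 600-PA scaling is simply to let you compare a guy with half a season at one level to a guy with a full one.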

Player Pages: Pitchers

Here’s the pitcher page for Jeff Hoffman. As we did above, a few definitions:

SO% , nB% : Strikeout and non-intentional walk/hit-by-pitch rate.

xHR : “Expected” home runs, an adjusted number that attempts to correct for “luck” in a pitcher’s home run rate.

tRA : Similar to FIP, but accounts for batted-ball type.

xIP : “Expected” innings pitched, adjusts for pitcher “luck.”

xR , xRA , xRA+ : “Expected” runs allowed, and expected Run Average. xRA is similar to FIP or xFIP. Attempts to measure pitcher skill, independent of defense and luck. xRA+ is scaled so that league average is 100, and higher is better. Read Matthew’s explainer for more information.

RAA : Runs (prevented) above average.

Ahd - oCt : Same as with hitters.

GB% , FB% , LD% , IF% , pFB% : Same as with hitters.

Str% , Ct% , SwS% , kS% : Strike rate, contact rate, swinging-strike rate, and percentage of strikeouts that are swinging.

RBBIP : Same as with hitters.
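For the pitch-level rates, a rough sketch of how numbers like Str%, Ct% and SwS% are conventionally built from pitch counts might look like this. The denominators here (pitches for strike and swinging-strike rate, swings for contact rate) are the common convention and an assumption on my part, not something confirmed from StatCorner, and the `PitchCounts` class and sample totals are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PitchCounts:
    pitches: int   # total pitches thrown
    strikes: int   # called, swinging, foul, and in-play strikes
    swings: int    # pitches the batter swung at
    contact: int   # swings that touched the ball (fouls + balls in play)


def pitch_rates(c: PitchCounts) -> dict:
    """Conventional pitch-level rates, as percentages."""
    whiffs = c.swings - c.contact
    return {
        "Str%": 100 * c.strikes / c.pitches,
        "Ct%": 100 * c.contact / c.swings,
        "SwS%": 100 * whiffs / c.pitches,
    }


# Made-up season line for illustration only.
print(pitch_rates(PitchCounts(pitches=1000, strikes=640, swings=460, contact=350)))
```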

Using StatCorner Data

Now that you know your way around, I’d like to leave you with a few tips on how best to use the data.

1. Focus on the “core performance indicators”

For hitters and pitchers, the most important ingredients to success are strikeout rate, walk rate, power/HR-allowed and BABIP/RBBIP. These four “core” skills represent the bulk of a player’s performance. It is extremely difficult to have major league success without being good at at least two of these things.
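If you only have a traditional stat line in front of you, the four core indicators can be approximated from counting stats. Below is a minimal sketch using the standard BABIP formula; note that StatCorner’s nB% and RBBIP also fold in hit-by-pitches and reaching on error, which this simplified version ignores. The function name and the sample line are hypothetical.

```python
def core_indicators(pa: int, ab: int, h: int, hr: int, so: int, bb: int, sf: int = 0) -> dict:
    """Approximate the four 'core' skills from a basic stat line."""
    balls_in_play = ab - so - hr + sf  # at-bats that ended with a ball in the field of play
    if pa <= 0 or balls_in_play <= 0:
        raise ValueError("need a positive number of PA and balls in play")
    return {
        "K%": 100 * so / pa,
        "BB%": 100 * bb / pa,
        "HR%": 100 * hr / pa,
        "BABIP": (h - hr) / balls_in_play,
    }


# Made-up hitter: 500 PA, 450 AB, 130 H, 10 HR, 90 SO, 40 BB, 5 SF.
print(core_indicators(pa=500, ab=450, h=130, hr=10, so=90, bb=40, sf=5))
```

The same function works for a pitcher if you feed it the totals he allowed.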

2. Supplement with batted-ball and plate-discipline stats

A hitter with a poor strikeout rate, normally a problem, might not be in trouble. Check his kL% (called strikeouts): if it’s very high, perhaps this player is working on improving his batting eye, and has experienced an uptick in called strikeouts as a result. As his eye improves, the strikeout rate should decline.

A pitcher with a poor HR rate could simply be having a run of bad luck: if his GB% is strong and he gives up very few pulled flyballs, expect the HR rate to normalize.

The core idea is this: use the process data from batted-balls and plate-discipline to help you understand how the player achieved the core results metrics discussed above.

3. Compare to league average

What is a strong strikeout rate in the Eastern League might only be an average strikeout rate for the Sally League. Due to differences in development, talent level, and park factors, each league has its own tendencies and trends that need to be adjusted for. Even though a player’s raw numbers may have changed upon promotion, the effect might not be real when measured relative to league average.
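One simple way to put this into practice is to index a player’s rate against his league’s average, in the spirit of the “+” stats where 100 is average and higher is better. This is my own sketch, not StatCorner’s formula, and the league averages below are invented to show the effect.

```python
def index_vs_league(player_rate: float, league_rate: float, lower_is_better: bool = False) -> float:
    """Express a rate as an index where 100 is league average and higher is better."""
    if player_rate <= 0 or league_rate <= 0:
        raise ValueError("rates must be positive")
    if lower_is_better:
        return 100 * league_rate / player_rate
    return 100 * player_rate / league_rate


# The same 20% hitter strikeout rate in two made-up leagues:
print(index_vs_league(20.0, 22.0, lower_is_better=True))  # league strikes out a lot
print(index_vs_league(20.0, 18.0, lower_is_better=True))  # league makes more contact
```

The identical raw number comes out above average in the first league and below average in the second, which is exactly the trap to watch for after a promotion.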

4. Use park adjustments

Unlike FanGraphs or Baseball Reference, StatCorner uses park adjustments split by batter handedness and by each type of outcome: singles, HR, strikeouts, etc. This provides a robust adjustment compared to the brute-force park factors used in calculating WAR. Some parks might suppress strikeout rate while inflating the rate of singles. Others might be extreme for lefty home runs, but stingy for right-handers. StatCorner can correct for these details on the fly, with the “park adjust” toggle.
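To make the idea concrete, here is a toy version of a split park adjustment: a table of factors keyed by handedness and outcome, with the raw rate divided by the matching factor. Dividing by a single factor is a common simplification and an assumption on my part; the source doesn’t spell out StatCorner’s method, and every factor below is hypothetical.

```python
# Hypothetical factors: above 1.0 means the park inflates that outcome.
PARK_FACTORS = {
    ("L", "HR"): 1.20,  # friendly to lefty power
    ("R", "HR"): 0.90,  # stingy for righties
    ("L", "SO"): 0.95,
    ("R", "SO"): 0.95,
}


def park_adjust(raw_rate: float, bats: str, outcome: str, factors: dict = PARK_FACTORS) -> float:
    """Estimate the neutral-park equivalent of a raw rate."""
    factor = factors.get((bats, outcome), 1.0)  # no known factor: leave the rate alone
    return raw_rate / factor


print(park_adjust(6.0, "L", "HR"))  # lefty's HR rate gets marked down
print(park_adjust(4.5, "R", "HR"))  # righty's HR rate gets marked up
```

A real adjustment would also account for how many games were played at home versus on the road, which this sketch skips.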

5. Pay attention to age

From Chris Mitchell’s work, we have found that a player’s age (relative to their league) is strongly predictive of future success. Even though a player might have underwhelming stats, being two years younger than average could be an indication to take the stats with a large grain of salt. The opposite is true as well: a 23-year-old who is raking in low-A is probably not as strong a prospect as his stats would indicate.
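A quick way to keep age in view is to carry the gap to league-average age alongside every stat line. The sketch below uses the two-year rule of thumb from the paragraph above as its threshold; the function names and the league-average age in the example are made up, and this is nowhere near how KATOH actually weighs age.

```python
def age_gap(player_age: float, league_avg_age: float) -> float:
    """Years older (positive) or younger (negative) than the league average."""
    return player_age - league_avg_age


def age_note(gap: float, threshold: float = 2.0) -> str:
    """Turn an age gap into a rough reading aid for the stat line."""
    if gap <= -threshold:
        return "young for level: cut underwhelming stats some slack"
    if gap >= threshold:
        return "old for level: discount gaudy stats"
    return "age-appropriate: read stats roughly at face value"


# A 23-year-old in a league with a made-up average age of 21.
print(age_note(age_gap(23, 21.0)))
```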

6. Combine with scouting reports

Scouting still represents half of the picture, and you should use scouting reports to give context to statistical data. Scouts can identify issues that may be difficult to resolve, or identify strengths that may not have appeared in the data (yet).

Conclusion

Statistical analysis and projection are valuable tools for evaluating and ranking minor league prospects. While still more of an art than a science, modern analytics have given us a much improved understanding of which players are likely to reach and succeed in the major leagues. Using empirical projection models (such as KATOH) and publicly available data (from StatCorner), you can incorporate the best-available information into your own evaluations. Having identified the “core” skills of a baseball player, you can examine a player’s development through the lens of statistics.

I’d like to give a special final thanks to both Chris Mitchell and Matthew Carruth for their excellent work, and for making it public. The realm of popular analytics couldn’t exist without their contributions. Finally, I hope you have a great holiday season, and happy PuRPing!