Introduction

A common thread in sabermetric research is looking at underlying metrics rather than results alone (e.g., strikeout rate, exit velocity, or O-Swing% rather than simply OPS). Recently, I’ve been thinking about taking this concept a step beyond pitcher strikeout and walk rates and examining pitch-by-pitch data for pitcher evaluation. While this article is not about evaluating individual pitchers, it was a good exercise for me in working with pitch-by-pitch data in R.

There are several common baseball clichés about combinations of pitch location and pitch type, which I will refer to as combos going forward. For example, offspeed pitches up in the zone, especially curveballs (i.e., the hanging curveball), are seen as bad pitches, whereas fastballs, or really any pitch, down and away are viewed positively. Using R and pitch data from the 2018 season, I explored whether these clichés held up under statistical scrutiny and whether there were any surprisingly effective or ineffective combos in 2018.

Methodology

Using this guide, I downloaded pitch-by-pitch data from the 2018 season (data is available for every season from 2008 onward) and stored it in a local SQLite database.

Baseball Savant Zone Variable, from the catcher’s perspective

Instead of using exact horizontal and vertical pitch locations, I decided to use the “zone” variable, which buckets pitches numerically according to the image above. Because I intended to group pitches by location for evaluation, it was easier to use a built-in variable than to build the buckets myself. As the article title suggests, I only looked at pitches in the strike zone, for two reasons. First, as you can see from the “zone” image, zones 11 through 14, which cover pitches outside the textbook strike zone, are much larger than zones 1 through 9, so treating all the pitches in each outside zone as one bucket seemed inaccurate. Doing it properly would mean creating sub-zones within zones 11 through 14, which is more time-consuming. Second, I think finding the best combos both in and out of the zone is too large a task for one article.
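The in-zone filter described above can be sketched as a query against the local SQLite database. I worked in R, so treat this as a minimal Python illustration rather than the code I ran. The table name `pitches` and the sample rows are made up for the example; `zone`, `pitch_type`, and `description` are assumed to match the column names in the Baseball Savant export.

```python
import sqlite3

# In-memory stand-in for the local database. The table name "pitches" and
# the rows below are hypothetical and exist only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pitches (pitch_type TEXT, zone INTEGER, description TEXT)")
conn.executemany(
    "INSERT INTO pitches VALUES (?, ?, ?)",
    [
        ("FF", 5, "called_strike"),
        ("CU", 13, "ball"),
        ("SL", 1, "hit_into_play"),
        ("CH", 11, "swinging_strike"),
    ],
)


def in_zone_pitches(connection):
    """Return only pitches in zones 1 through 9 (the textbook strike zone)."""
    query = (
        "SELECT pitch_type, zone, description FROM pitches "
        "WHERE zone BETWEEN 1 AND 9 ORDER BY zone"
    )
    return connection.execute(query).fetchall()


rows = in_zone_pitches(conn)
```

Zones 11 through 14 never reach the later steps, so nothing downstream needs to know about them.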

To evaluate each combo, I chose xwOBA as my statistic. One could argue that wOBA is more tested than xwOBA, but the correlation between the two is so high that I do not think it is a huge deal. A major consequence of using xwOBA (or wOBA) as an evaluation tool is that it only applies to batted balls, not to foul balls, strikes, or balls. I decided to omit foul balls from the dataset and assign swinging strikes, called strikes, and balls an xwOBA of zero. I kept called balls in the dataset because, as I said previously, I only included pitches within the textbook strike zone, so any ball in the dataset was actually a strike called incorrectly. One could argue this valuation system places too much value on single strikes relative to batted balls, which is why I will shortly post another article that excludes foul balls and all other non-batted balls.
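The valuation rule above can be sketched in a few lines. This is a Python illustration under stated assumptions, not my original R code: the `description` strings (`foul`, `hit_into_play`, and so on) are assumed to match the Baseball Savant export, and the helper name `pitch_value` is hypothetical.

```python
from typing import Optional


def pitch_value(description: str, xwoba: Optional[float]) -> Optional[float]:
    """Return the xwOBA credited to one in-zone pitch, or None to drop it.

    The description strings are assumptions about the Baseball Savant export.
    """
    if description == "foul":
        return None  # foul balls are omitted from the dataset
    if description.startswith("hit_into_play"):
        return xwoba  # batted balls keep their xwOBA (None if it is missing)
    return 0.0  # swinging strikes, called strikes, and miscalled balls


# Hypothetical sample: one batted ball among three strikes and a foul.
sample = [
    ("called_strike", None),
    ("foul", None),
    ("hit_into_play", 0.9),
    ("swinging_strike", None),
    ("ball", None),
]
values = [pitch_value(d, x) for d, x in sample]
kept = [v for v in values if v is not None]
average_xwoba = sum(kept) / len(kept)
```

In this made-up sample a single well-struck ball is averaged with three zero-valued pitches, which is the same effect that keeps the averages in the results low.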

NOTE: This means the xwOBAs will seem extremely low, but this is only due to the reasons stated above.

I also split the data by pitcher and batter handedness, giving four result groups. I excluded combos with too few examples, such as splitters from left-handed pitchers right down the middle.
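The split-and-filter step can be sketched as a grouped average. Again, this is a Python illustration rather than the R code I used, and the cutoff `MIN_PITCHES` is a hypothetical value chosen for the toy data, not the threshold from my analysis. The function name `rank_combos` and the tuple layout are likewise assumptions for the example.

```python
from collections import defaultdict

MIN_PITCHES = 2  # hypothetical cutoff, sized for the toy data below


def rank_combos(pitches, min_pitches=MIN_PITCHES):
    """Average xwOBA per (pitch type, zone) within each handedness split.

    Each pitch is a tuple (p_throws, stand, pitch_type, zone, value).
    Combos with fewer than min_pitches examples are dropped, and the
    remaining combos are sorted from lowest to highest average xwOBA.
    """
    values = defaultdict(list)
    for p_throws, stand, pitch_type, zone, value in pitches:
        values[(p_throws, stand, pitch_type, zone)].append(value)

    ranked = defaultdict(list)
    for (p_throws, stand, pitch_type, zone), vals in values.items():
        if len(vals) >= min_pitches:
            average = sum(vals) / len(vals)
            ranked[(p_throws, stand)].append((pitch_type, zone, average))

    for combos in ranked.values():
        combos.sort(key=lambda combo: combo[2])
    return dict(ranked)


# Made-up pitches: (pitcher hand, batter side, pitch type, zone, value).
toy_pitches = [
    ("L", "L", "SI", 7, 0.0),
    ("L", "L", "SI", 7, 0.1),
    ("L", "L", "FF", 5, 0.4),
    ("L", "L", "FF", 5, 0.0),
    ("L", "L", "FS", 5, 0.9),  # only one example, so this combo is dropped
    ("R", "L", "CU", 1, 0.0),  # only one example, so this split is empty
]
result = rank_combos(toy_pitches)
```

The first entries of each split's list correspond to the "Best" tables and the last entries to the "Worst" tables.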

Results

LHP on LHB (Left handed pitcher on Left handed batter)

Best:

Pitch Type     Zone                 Average xwOBA
Sinker         7 (Down and Away)    0.0438
Four-seam FB   7                    0.0453
Sinker         4 (Middle-Away)      0.0890
Four-seam FB   1 (Up and Away)      0.0943
Slider         6 (Middle-In)        0.0944

Worst:

Pitch Type     Zone                 Average xwOBA
Two-seam FB    5 (Middle-Middle)    0.225
Two-seam FB    6 (Middle-In)        0.221
Four-seam FB   5                    0.216
Sinker         6                    0.207
Slider         5                    0.196

The only result that surprised me was the effectiveness of the middle-in slider, though I can see it freezing hitters with horizontal movement on the inner corner. Lefties don’t like down-and-away pitches from LHPs, but they love center-cut pitches, which makes sense.

RHP on LHB

Best:

Pitch Type     Zone                 Average xwOBA
Curveball      1 (Up and Away)      0.0609
Slider         1                    0.0683
Four-seam FB   7 (Down and Away)    0.0702
Two-seam FB    3 (Up and In)        0.0789
Changeup       9 (Down and In)      0.0887

Worst:

Pitch Type     Zone                 Average xwOBA
Two-seam FB    5 (Middle-Middle)    0.294
Sinker         5                    0.284
Cutter         6 (Middle-In)        0.249
Cutter         5                    0.242
Sinker         4 (Middle-Away)      0.233

The best pitches for this split are all over the place: 5 different pitches and 4 different locations. It seems like the “hanging pitch” saying does not apply when RHPs face lefties, as long as they stay away and backdoor it. The effectiveness of the changeup down and in is interesting, especially with the adage of down and in pitches being easy for left-handed hitters to drop the barrel on.

LHP on RHB

Best:

Pitch Type     Zone                 Average xwOBA
Curveball      7 (Down and In)      0.0705
Changeup       7                    0.0785
Two-seam FB    7                    0.0853
Four-seam FB   9 (Down and Away)    0.0872
Sinker         1 (Up and In)        0.0888

Worst:

Pitch Type     Zone                 Average xwOBA
Two-seam FB    5 (Middle-Middle)    0.296
Sinker         5                    0.249
Cutter         5                    0.246
Cutter         4 (Middle-In)        0.240
Changeup       5                    0.228

Once again, avoiding the middle of the zone seems key. However, lefties pitching to right-handed batters appear best off working down and in, rather than down and away as they do against left-handed batters.

RHP on RHB

Best:

Pitch Type     Zone                 Average xwOBA
Two-seam FB    9 (Down and Away)    0.0574
Curveball      1 (Up and In)        0.0589
Sinker         9                    0.0637
Four-seam FB   9                    0.0642
Slider         1                    0.0687

Worst:

Pitch Type     Zone                 Average xwOBA
Changeup       5 (Middle-Middle)    0.267
Cutter         5                    0.263
Two-seam FB    5                    0.244
Two-seam FB    4 (Middle-In)        0.241
Changeup       4                    0.236

Again, breaking balls at the top of the zone seem to be not just passable but genuinely effective.

Conclusion

In general, avoiding the center of the plate with any type of pitch, especially fastballs, seems like a good idea, which is obvious. Fastballs appeared frequently in the worst-pitch lists, which aligns with the general trend of pitchers moving away from fastballs in favor of more breaking balls. Pitching down and in to righties as a LHP also seems like an effective strategy. The most intriguing result, in my opinion, was the effectiveness of breaking pitches at the top of the zone in RHP/RHB and RHP/LHB matchups. It seems “hanging” pitches are those that break into the middle of the zone height-wise rather than those that stay at the top of the zone. I intend to explore this topic in future articles.

This data, of course, treats all pitches within a pitch type as the same. Pitches vary greatly within their own classifications, meaning pitchers like Jacob deGrom might be able to get away with combos that pitchers like Jerad Eickhoff cannot. Initially, I thought using combos to evaluate individual pitchers might be worthwhile, but because of this variation I decided against it. However, I think combos can give struggling pitching staffs insight into which location and pitch type combinations to emphasize in general, while still making individual adjustments based on pitch quality and hitter tendencies.

Thank you for reading! Please comment with any thoughts you had or any other baseball topics you would like to read about.