Josh Bell had an interesting 2019 season at the plate. He was one of the best hitters in baseball for the first three months of the year, but he could not come close to replicating that success in the second half. He finished the year with a very strong batting line, but the radical first-half vs. second-half splits he posted make him a bit harder to project for 2020.

I took this opportunity to dive into the MLB Statcast data set and see what I could find about Josh Bell’s 2019 season. There are countless baseball statistics resources out there, but having some coding knowledge in concert with this massive data set gives us seemingly limitless options on how to dig in.

The full results of this study are in this Python notebook hosted on Google Colab:

https://colab.research.google.com/drive/1rpOQQ4anZ3-hzHy2y7KHrfqHVTCeY6X_

If you don’t know anything about scripting, that page probably won’t make much sense to you. The notebook includes annotations that explain each step in general terms, but it will mostly be enjoyable for people who already know some Python (or another scripting language). Having this data set and a Python environment like Google Colab to script in gives us all kinds of great opportunities to look at baseball in a brand new way, and I hope somebody gets as much of a kick out of it as I have.

The data set contains just about everything you could possibly want to know about every single pitch thrown in the baseball season. For full details about every data point available in this data, check out this page.

To give you a brief idea of what we are looking at here, here is a condensed version of a randomly chosen row from the data:

All of the categories on the left are columns, and on the right are the row values for this pitch. It was a changeup thrown from Fernando Rodney to Brian Goodwin, which Goodwin hit for a single at 73 miles per hour. There is way more than just this information given, but that should give you some idea of what we’re working with.

I wrote two functions to help us learn from the data.

A function that calculates the batting statistics given a chunk of rows. You can send in any grouping of rows and it will tell you all kinds of information about the hitter’s performance from that data.

A pitch plotter. This takes the location coordinates and pitch type information from the data and plots every pitch in whatever rows you feed it.
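For readers who want a feel for what the first function might look like, here is a minimal sketch. It is not the notebook's code. It assumes each row is a dict (for example from `csv.DictReader`) with a Statcast-style `events` column that is filled in only on the pitch that ends a plate appearance, and the event names listed are assumptions to check against your own download. Edge cases such as intentional walks or baserunning outs are ignored.

```python
from typing import Iterable, Mapping

# Assumed `events` values; verify them against the data you download.
TOTAL_BASES = {"single": 1, "double": 2, "triple": 3, "home_run": 4}
NOT_AN_AT_BAT = {"walk", "hit_by_pitch", "sac_fly", "sac_bunt"}


def batting_line(rows: Iterable[Mapping[str, str]]) -> dict:
    """Summarize the plate appearances found in a group of pitch rows."""
    pa = ab = hits = bases = walks = strikeouts = 0
    for row in rows:
        event = (row.get("events") or "").strip()
        if not event:
            continue  # this pitch did not end the plate appearance
        pa += 1
        if event == "walk":
            walks += 1
        if event == "strikeout":
            strikeouts += 1
        if event not in NOT_AN_AT_BAT:
            ab += 1
        if event in TOTAL_BASES:
            hits += 1
            bases += TOTAL_BASES[event]
    return {
        "PA": pa,
        "AB": ab,
        "H": hits,
        "AVG": round(hits / ab, 3) if ab else 0.0,
        "SLG": round(bases / ab, 3) if ab else 0.0,
        "BB%": round(walks / pa, 3) if pa else 0.0,
        "K%": round(strikeouts / pa, 3) if pa else 0.0,
    }
```

Because the function only looks at the rows it is handed, the same call works for a single game, a single pitch type, or a whole season.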

To show examples of the outputs of these, here first are the league average values, found by feeding in all rows through the first function:

And here are all the pitches that Gerrit Cole threw in his first start of the season, shown from the perspective of the catcher.
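A pitch plotter along these lines could be sketched as follows. This is an illustration under assumptions, not the notebook's implementation: it assumes dict rows with Statcast-style `plate_x`, `plate_z`, and `pitch_type` columns, and it needs matplotlib installed for the drawing step. The function names are made up for this example.

```python
from collections import defaultdict


def group_locations(rows):
    """Map each pitch type to its list of (plate_x, plate_z) points."""
    groups = defaultdict(list)
    for row in rows:
        try:
            point = (float(row["plate_x"]), float(row["plate_z"]))
        except (KeyError, TypeError, ValueError):
            continue  # skip pitches with missing location data
        groups[row.get("pitch_type") or "unknown"].append(point)
    return dict(groups)


def plot_pitches(rows, title="Pitch locations"):
    """Scatter every pitch in `rows`, one color per pitch type."""
    import matplotlib.pyplot as plt  # imported here; only needed for drawing

    fig, ax = plt.subplots(figsize=(5, 6))
    for pitch_type, points in sorted(group_locations(rows).items()):
        xs, zs = zip(*points)
        ax.scatter(xs, zs, label=pitch_type, alpha=0.7)
    ax.set_xlabel("plate_x (feet)")
    ax.set_ylabel("plate_z (feet)")
    ax.set_title(title)
    ax.set_aspect("equal")
    ax.legend()
    return ax
```

Keeping the grouping separate from the drawing makes it easy to check the data going into a plot before rendering it.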

You can send any customized group of rows through these functions and get these results, making it easy to find how hitters and pitchers fared in any situation you can imagine.

What We Learned about Josh Bell

When you read through the Python notebook, you will see that our key findings were these.

Bell was a significantly better hitter against right-handed pitchers last year, and he faced substantially less right-handed pitching in the second half of the season (33% of his pitches seen were from lefties in the second half, while that number was 25% in the first half).
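A split like this one can be computed in a few lines. The sketch below assumes dict rows with Statcast-style `game_date` (ISO format, such as `2019-04-05`) and `p_throws` (`L` or `R`) columns. The boundary between halves is also an assumption: it uses the date of the 2019 All-Star Game, which may not be the cutoff used in the notebook.

```python
from datetime import date

# Assumption: the 2019 All-Star Game date separates the two halves.
SPLIT = date(2019, 7, 9)


def lefty_share_by_half(rows, split=SPLIT):
    """Return the fraction of pitches thrown by left-handers in each half."""
    counts = {"first": [0, 0], "second": [0, 0]}  # [from lefties, total]
    for row in rows:
        played = date.fromisoformat(row["game_date"])
        half = "first" if played < split else "second"
        counts[half][1] += 1
        if row["p_throws"] == "L":
            counts[half][0] += 1
    return {
        half: (lefty / total if total else 0.0)
        for half, (lefty, total) in counts.items()
    }
```

Passing a different `split` date is enough to test other cutoffs, such as the end of June.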

He slugged .778 against curveballs in the first half, but just .467 in the second half. He saw more curves in the second half (10% compared to 7%), and those curves had five inches more vertical movement on average.

Plate discipline actually improved for Bell in the second half, as he swung less (46% in the second half, down from 49%), walked more (16% compared to 12%), and whiffed less (10% compared to 12%).
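Swing and whiff rates of this kind can be derived from the per-pitch outcome column. Here is a rough sketch, assuming dict rows with a Statcast-style `description` column; the outcome labels below are assumptions to verify against your data. Whiffs are counted per pitch here (often called swinging-strike rate); dividing by swings instead is the other common convention, and the notebook's exact definition may differ.

```python
# Assumed `description` values for pitches the batter swung at.
SWINGS = {
    "swinging_strike",
    "swinging_strike_blocked",
    "foul",
    "foul_tip",
    "hit_into_play",
    "hit_into_play_no_out",
    "hit_into_play_score",
}
WHIFFS = {"swinging_strike", "swinging_strike_blocked"}


def plate_discipline(rows):
    """Return swing and whiff rates as fractions of all pitches seen."""
    pitches = swings = whiffs = 0
    for row in rows:
        outcome = row.get("description") or ""
        pitches += 1
        if outcome in SWINGS:
            swings += 1
        if outcome in WHIFFS:
            whiffs += 1
    if pitches == 0:
        return {"swing%": 0.0, "whiff%": 0.0}
    return {"swing%": swings / pitches, "whiff%": whiffs / pitches}
```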

However, Bell did not hit the ball nearly as well when making contact in the second half, with a 10.5% barrel rate in the second half after posting an elite 14.3% in the first half.

Bell still hit a good number of balls at an optimal launch angle, but he hit far more balls directly into the ground (around a -20 degree angle) in the second half. This undoubtedly hampered his power numbers.
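One simple way to see a shift like this is to bucket launch angles and compare the counts between halves. The sketch below assumes dict rows with a Statcast-style `launch_angle` column in degrees that is blank for pitches without a tracked batted ball. The 10-degree bucket width is an arbitrary choice for illustration, and you may want to filter to balls in play before calling it.

```python
import math
from collections import Counter


def launch_angle_histogram(rows, width=10):
    """Count batted balls per launch-angle bucket, keyed by the lower edge."""
    buckets = Counter()
    for row in rows:
        try:
            angle = float(row["launch_angle"])
        except (KeyError, TypeError, ValueError):
            continue  # no launch angle recorded for this pitch
        if math.isnan(angle):
            continue
        lower_edge = math.floor(angle / width) * width
        buckets[lower_edge] += 1
    return dict(sorted(buckets.items()))
```

Running it once on first-half rows and once on second-half rows gives two small tables that can be compared bucket by bucket.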

There seems to be reason for optimism that Bell will have another strong season in 2020, as his plate discipline was very strong all season long. The key to being a strong hitter is to put a lot of balls in play while continuing to take walks and barrel balls at a good rate. Bell did all of those things in the second half, just not nearly as well as he did in the first half.

I will be doing more of this kind of Python notebook analysis about the 2019 season, and about the new 2020 season if we have one. Requests are encouraged; there really is nothing that we can’t find with this incredible data set.