“It ain’t nothin’ ‘til I call it.” Bill Klem thus defined the role of the umpire. He would know, since he officiated major league baseball for 37 years and still holds the record for working 18 World Series.

There has been a lot of discussion lately about having balls and strikes called robotically using MLB’s Statcast technology. In response, Commissioner Rob Manfred stated, “In all candor, that technology has a larger margin of error than we see with human umpires,” as reported by Patrick Saunders of The Denver Post.

That sounds like an invitation to think about the physics associated with Statcast to understand the potential sources of these errors. First, let’s be sure we understand the meaning of “error” as physicists use the word.

Two Types of Error

In common language, an error is something that can be remedied. In scientific analysis, we call this type of error “systematic error.” In this case, the devices we are using do not report the correct values and, in principle, we can find the problem and correct it.

However, there is another type of error called “random error.” This kind of error is intrinsic to the measurement process. It can be minimized but never completely eliminated. Random error is associated with the fact that measurements are never perfectly reproducible.

Suppose you want to get a pitching machine to fire the ball right down main street. You immediately notice all the pitches are low and outside. You can probably adjust the machine to correct this systematic error. Now, the pitches are right down the middle, but they vary pitch-to-pitch by a few inches up, down, inside, or outside of the center. This random error can only be minimized, perhaps by using a better pitching machine, but it will never be eliminated completely.
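The pitching machine example can be mimicked with a few lines of Python. This is only a sketch: the 6-inch offset and 2-inch scatter are numbers I made up for illustration, and `simulate_pitches` is a hypothetical helper, not anything taken from Statcast.

```python
import random
import statistics


def simulate_pitches(n, bias, spread, seed=0):
    """Return n simulated horizontal pitch positions in inches.

    bias   -- systematic error: a constant offset of the aim point
    spread -- random error: standard deviation of the pitch-to-pitch scatter
    """
    rng = random.Random(seed)
    return [rng.gauss(bias, spread) for _ in range(n)]


# Made-up numbers: the machine throws 6 inches outside with 2 inches of scatter.
before = simulate_pitches(10_000, bias=6.0, spread=2.0)
# Adjusting the machine removes the offset but leaves the scatter alone.
after = simulate_pitches(10_000, bias=0.0, spread=2.0)

for label, pitches in (("before", before), ("after", after)):
    print(label,
          "mean:", round(statistics.mean(pitches), 2),
          "std dev:", round(statistics.stdev(pitches), 2))
```

The average moves to the center once the offset is removed, while the standard deviation stays put. That unchanged scatter is the random error.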

There is evidence to suggest Statcast suffers from demonstrable systematic errors. Rob Arthur of FiveThirtyEight wrote about these systematic errors in April of 2017 in “Baseball’s New Pitch-Tracking System Is Just A Bit Outside.” He compared Statcast pitch locations from that season with PITCHf/x pitch locations from previous years. He found the average systematic error in horizontal position across major league parks for Statcast was only slightly higher than PITCHf/x – about 0.2 inches. However, the vertical systematic error increased from under half an inch to almost 0.75 inches.

One would suspect Statcast is working to fix these systematic errors, and the current errors should drop over time as happened with PITCHf/x. After all, systematic errors can be fixed with better calibration, data analysis, and measurement techniques.

Let’s go back to the properly adjusted pitching machine to deal with random errors. If one fired thousands of pitches and recorded the number of pitches as a function of their horizontal (x) positions, the result would likely look like the graph below.

The most likely position for a given pitch is right in the middle. However, due to random errors, there is an ever-decreasing chance of the pitch actually turning up farther and farther away from the center. This type of distribution is called a normal distribution, and it is a common way to deal with random errors because one can use this curve to estimate the probability of getting any given value (or range of values) for x.
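To make the “probability of a range of values” idea concrete, here is a minimal sketch using the standard result that the area under a normal curve can be written with the error function. The 2-inch standard deviation is a made-up number for the pitching machine, and the function names are my own.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Probability that a normally distributed value is at or below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def prob_between(lo, hi, mean=0.0, sd=1.0):
    """Probability that the value lands between lo and hi."""
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)


sd = 2.0  # inches; hypothetical scatter of the pitching machine
print(prob_between(-sd, sd, sd=sd))          # about 0.68
print(prob_between(-2 * sd, 2 * sd, sd=sd))  # about 0.95
```

In other words, roughly two-thirds of the pitches land within one standard deviation of the center and about 95 percent land within two.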

The key parameter describing the normal distribution is the width of the curve, which is set by the standard deviation. The standard deviation for Statcast pitch locations is not publicly available. So, I’ll have to make some estimates.

The Statcast Mistake Rate

The goal here is to use the normal distribution to estimate the mistake rate for ball and strike calls produced by a RoboUmp using Statcast data. So, imagine a pitch that actually crosses the plate at a horizontal position, x, as shown below. We’ll assume Statcast has some standard deviation, so it could report the position of the ball in different locations with probabilities given by the normal distribution.

In the sketch above, x = 0 is the center of home plate. The ball actually crosses the plate at a position x. The position labeled D is the edge of the strike zone, which is half the width of the plate plus half the diameter of the ball from the center. The blue curve is the normal distribution of Statcast-reported positions for this event. Note there is some chance this strike will be reported as a ball because the distribution is non-zero at locations greater than D.

Using this idea for every possible actual position of the ball, one can find the probability Statcast will report the pitch incorrectly. Below is a graph of the mistake probability as a function of actual position for a random error standard deviation of 0.25 inches.


You can see the error probability is 0.5 if the edge of the ball aligns with the edge of home plate (x = D = 9.95 inches). That is, this location is a 50/50 call. Farther from the edge, the probability of a missed call drops and is essentially zero when a pitch lands an inch away from the edge. This curve includes both kinds of mistakes: a pitch with x < D being called a ball and a pitch with x > D being called a strike.
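For readers who want to reproduce the shape of this curve, here is a minimal sketch under the same assumption of normally distributed random error. The function name is mine, and it only considers the edge on one side of the plate, which is reasonable as long as the standard deviation is small compared with the width of the zone.

```python
import math

D = 9.95  # inches; edge of the zone measured from the center of the plate


def mistake_probability(x, sd, edge=D):
    """Chance that a pitch actually at x is reported on the wrong side of the edge.

    x  -- actual horizontal position in inches (x >= 0, one side of the plate)
    sd -- standard deviation of the random error in inches
    """
    # Distance from the edge, measured in standard deviations.
    z = abs(edge - x) / sd
    # Area in the tail of the normal curve that spills across the edge.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


sd = 0.25  # inches
print(mistake_probability(9.95, sd))   # 0.5, the 50/50 call right on the edge
print(mistake_probability(8.95, sd))   # roughly 3 in 100,000, one inch inside
print(mistake_probability(10.95, sd))  # the same, one inch outside
```

Because only the distance from the edge matters, the curve is symmetric about D, which is why both kinds of missed calls appear in the same plot.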

If pitches were uniformly distributed across the strike zone, the total mistake rate could be found by just adding up these probabilities. However, we know pitchers try to keep the ball near the edges of the plate. If they are actually successful, the total mistake rate should increase because the ball is more often in the mistake-prone region.

The plot above is the probability of Statcast reporting a pitch from July of 2017 in the region between the center of home plate and 16 inches to the catcher’s right. You can see pitchers are only somewhat successful at keeping the ball near the edge of the strike zone. Combining the probability of a given pitch location with the probability of a missed call by Statcast as a function of the random error standard deviation results in the plot below.

I have always heard, though I cannot verify it, that major league umps are expected to have less than a five percent error rate. I don’t know whether this means five percent of called pitches or five percent of all pitches, but I suspect the former. Anyway, this analysis shows that as long as Statcast has small systematic errors and random errors less than about 0.9 inches, it should be as good as umpires at calling inside or outside pitches.
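The combination of pitch location and missed-call probability amounts to a weighted average, which can be sketched as follows. To be clear about the assumptions: I do not reproduce the July 2017 pitch data here, so `uniform_density` is a placeholder that treats every location from the center of the plate out to 16 inches as equally likely. The function names and the simple midpoint sum are mine as well.

```python
import math

D = 9.95  # inches; edge of the zone measured from the center of the plate


def mistake_probability(x, sd, edge=D):
    """Chance a pitch actually at x is reported on the wrong side of the edge."""
    return 0.5 * math.erfc(abs(edge - x) / (sd * math.sqrt(2.0)))


def total_mistake_rate(sd, density, x_max=16.0, steps=1600):
    """Average the mistake probability over pitch locations from 0 to x_max.

    density -- function giving the relative likelihood of a pitch at x;
               it does not need to be normalized.
    """
    dx = x_max / steps
    xs = [(i + 0.5) * dx for i in range(steps)]  # midpoints of each slice
    weights = [density(x) for x in xs]
    weighted = sum(w * mistake_probability(x, sd) for x, w in zip(xs, weights))
    return weighted / sum(weights)


def uniform_density(x):
    # Placeholder assumption, NOT the real pitch distribution.
    return 1.0


for sd in (0.25, 0.5, 0.9):
    print(sd, round(total_mistake_rate(sd, uniform_density), 4))
```

With the uniform placeholder, the mistake rate grows roughly in proportion to the standard deviation. Swapping in a density that piles pitches up near the edge would push the rate higher, for the reason given above.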

The Top and Bottom of the Zone

Now we should investigate high and low pitches. Ball and strike calls here are not as cut-and-dried. The horizontal piece of the strike zone is carefully and quantitatively defined by the width of home plate. The vertical strike zone is much more nebulous. The MLB definition of the strike zone states:

“The STRIKE ZONE is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the hollow beneath the kneecap. The Strike Zone shall be determined from the batter’s stance as the batter is prepared to swing at a pitched ball.”

It is accompanied by the sketch below:

This definition leaves plenty of room for interpretation as far as the vertical part of the zone is concerned. Many batters start in an upright stance and move into a crouch only as they swing. Others start in a deep crouch and become more upright as they unload. On top of that, the kneecap may be hard to spot if the batter wears loose pants.

PITCHf/x originally used poorly paid “stringers” to sit in a dark room under the stands and manually turn a dial to set the top and bottom of the zone on the video image of the batter. Saunders reports that Statcast uses the previous calls of major league umpires to build a database of the top and bottom of the strike zone for each hitter.

Isn’t that ironic? Until MLB comes up with a machine-comprehensible definition of the top and bottom of the strike zone, machines will need the assistance of humans to define the strike zone for the machines.

Other Issues

One obvious problem is that, on occasion, Statcast simply misses a pitch or a hit. These incidents seem to be occurring less and less frequently, but when one does happen, would the RoboUmp have to declare a “do-over”? Several times during the World Series telecast, the strike zone box disappeared. Of course, we don’t know if that was a Statcast failure or a production mistake.

On several occasions during the World Series, I also noticed the replay of a pitch showed the ball in a noticeably different position than the “live action” did. Again, it is not clear if Statcast is to blame or the problem was a production issue.

One last concern for using Statcast data to power a RoboUmp involves the time required to collect the video and radar data, process it into meaningful numbers, and transmit those values to a RoboUmp. When one watches a broadcast, it appears as though the system produces the results in real time. The speed and location of the pitch appear on your TV as the pitch is caught by the catcher. It is easy to forget that the broadcast has been delayed by a few seconds for the express purpose of adding those graphics.

The time for data processing and transmitting is not available publicly. However, I have noticed it takes at least one second, sometimes longer, for the pitch speed to be posted on the scoreboard in most ballparks. It is not clear if this data comes from Statcast or some radar gun positioned behind the plate. If it is from Statcast, it would be an estimate of the processing and transmission time needed to alert a RoboUmp.

Travis Sawchik has suggested that perhaps inside/outside calls could be made by the RoboUmp while high/low calls are made by the human umpire. So, when the game comes down to the winning run on second in the bottom of the ninth and the closer fires a two-strike pitch on the black, we should wait a second or two for the scoreboard to tell us whether the game is over. That can’t happen.

Of course, processing and transmission times may drop as Statcast improves, allowing more instantaneous pitch calls. Nonetheless, we’ll still have the random errors and the issues associated with the definitions of the top and bottom of the strike zone to address.

I guess we’ll leave the last words to Bill Klem, who once replied to a rookie pitcher complaining about the strike zone, “Son, when you pitch a strike, Mr. Hornsby will let you know.” The point is that, for now, as the Commish says, calling balls and strikes must remain a human endeavor.

References and Resources