Motivation

One of the hardest aspects of baseball to quantify and evaluate is fielding ability. Most events in baseball, such as hitting events, are discrete, which makes them easy to tabulate and model probabilistically. The central difficulty with fielding is that we are trying to evaluate players on a continuous playing surface, where we must take into account not just whether a successful play was made, but whether a successful play was possible. The much-maligned error statistic is a subjective attempt at discretising this phenomenon: players are assigned an error if the official scorer deems that their unsuccessful play should have been successful. However, tabulating errors isn't a good measure of ability without a corresponding measure that credits a player for making a play that most players wouldn't have made.

Recent techniques such as Ultimate Zone Rating or the Plus-Minus system from The Fielding Bible are based on the tabulation of both positive and negative fielding events. These statistics are more accurate measures of fielding ability. However, despite being obvious improvements on previous methods, both of these approaches are still based on dividing the baseball field into discrete zones and vectors, and tabulating events within each zone. Ideally, the baseball field would be treated as the continuous playing surface that it actually is, instead of a set of zones or vectors. Instead of tabulating fielding events within discrete zones, we fit continuous probability distributions to each fielder based on their past fielding events.

Methodology for Grounder Balls-In-Play (g-bip)

Our raw data is from Baseball Info Solutions (BIS). For each grounder ball-in-play (g-bip), we have the (x,y) coordinates in the field where the g-bip was fielded, a "velocity" classification (ranging from 1 to 5) for the g-bip, and the number of outs made on the play. We defined any play where one or more outs were made as "successful".
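As a rough illustration of the data described above, a single g-bip record and the "successful" flag could be represented as follows. This is only a sketch: the class name GroundBall, its field names and the helper success_rate are hypothetical, and they do not reflect the actual layout of the BIS data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroundBall:
    """One grounder ball-in-play (g-bip). All field names are hypothetical."""
    x: float       # x coordinate where the g-bip was fielded
    y: float       # y coordinate where the g-bip was fielded
    velocity: int  # velocity classification, from 1 to 5
    outs: int      # number of outs made on the play

    def __post_init__(self):
        # Reject records that fall outside the ranges described in the text.
        if not 1 <= self.velocity <= 5:
            raise ValueError("velocity must be between 1 and 5")
        if self.outs < 0:
            raise ValueError("outs cannot be negative")

    @property
    def successful(self) -> bool:
        # A play counts as successful if one or more outs were made.
        return self.outs >= 1


def success_rate(plays):
    """Fraction of plays that were successful (raw, unsmoothed)."""
    plays = list(plays)
    if not plays:
        raise ValueError("need at least one play")
    return sum(p.successful for p in plays) / len(plays)


# Toy records, invented purely for illustration.
toy_plays = [
    GroundBall(x=95.0, y=140.0, velocity=3, outs=1),
    GroundBall(x=60.0, y=150.0, velocity=5, outs=0),
    GroundBall(x=110.0, y=120.0, velocity=2, outs=2),
]
toy_rate = success_rate(toy_plays)
```

A raw success rate like this ignores where the ball was hit, which is exactly the limitation the smooth models in the steps below are meant to address.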
Our evaluation procedure consisted of the following steps:

1. Estimating starting locations for each position

Our BIS data does not provide a key piece of information for each g-bip: the location of each fielder before the ball was hit. We estimate the starting location for each position as the (x,y) location in the field where that position has the highest overall probability of making a successful play. For each grounder, we then convert the g-bip coordinates into the angle at which the grounder was hit off the bat. An angle of 0 corresponds to the 3rd base line, while an angle of 90 corresponds to the 1st base line.

2. Fitting smooth models for the average fielder at each position

We model the probability of a successful play on a grounder as a smooth function of the angle between the fielder's location and the path of the g-bip. We fit a different function for each velocity category, and also allow a different function for fielders moving to the left or to the right. These models are calculated using the data from all infielders, and so represent the ability of an aggregate fielder at each position. In the figure below, we show the probability model at each position for successful fielding of grounders with an intermediate velocity. We see that each position has a distinct probability model. Note that pitchers seem to have a much larger range than the other infield positions only because they are much closer to home plate, and therefore do not have to travel as far to cover the same range of angles from home plate.

3. Fitting player-specific models and calculating differences

We fit the same probability models using only the data for each individual fielder, allowing different parameters for each player. Since we have a different model for each individual player, we can quantify the difference between players by comparing each player's probability of making an out to the aggregate probability of making an out.
As an example, the figure below illustrates the comparison on grounders between the aggregate model for the SS position and the individual models for the best and worst shortstops.

4. Weighted sum of player-specific differences

For each possible angle, we can calculate the difference D between a particular fielder's probability of success and the aggregate probability of success. A rough measure of fielder ability is the sum over all possible angles of this difference (individual player - aggregate) in the probability of making a successful play. This sum is carried out by simple numerical integration. However, since not all angles occur with equal frequency, our SAFE measures are actually calculated as a frequency-weighted sum, so that more frequent angles are more important. In addition, our sum is also weighted by the average run consequence of each angle, which allows us to take into account the different consequences of grounders to different areas, e.g. a missed grounder down the first base line leads to more bases than a missed grounder to the shortstop. The figure below shows the different weights that go into our aggregation.

Thus, for an individual player, their SAFE statistic can be interpreted as their expected runs cost/saved relative to the average fielder. A good fielder will have a large positive SAFE, which means a high number of runs saved, whereas a bad fielder will have a large negative SAFE, which means a high number of runs cost.

Methodology for Balls-In-Play in the air (a-bip)

Our raw data is from Baseball Info Solutions, which was also used for The Fielding Bible. For each ball-in-play hit into the air (a-bip), we have the (x,y) coordinates in the field where the a-bip was fielded, a "velocity" classification (ranging from 1 to 5) for the a-bip, and the number of outs made on the play. We defined any play where one or more outs were made as "successful".
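Before moving on to the a-bip steps, the weighted sum from step 4 of the grounder methodology can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name safe_value is hypothetical, the trapezoid rule is just one choice of "simple numerical integration", frequency is assumed to be the expected number of grounders per unit of angle, run_value is assumed to be the average run consequence of a play at that angle, and the difference is taken as player success probability minus aggregate success probability, so that positive values mean runs saved. The toy numbers are invented.

```python
def safe_value(angles, p_player, p_average, frequency, run_value):
    """Trapezoid-rule approximation of a frequency- and run-weighted sum.

    All arguments are equal-length sequences evaluated on the same grid of
    angles (0 = 3rd base line, 90 = 1st base line). The names and units are
    assumptions made for this sketch.
    """
    n = len(angles)
    columns = (p_player, p_average, frequency, run_value)
    if n < 2 or any(len(col) != n for col in columns):
        raise ValueError("need at least two grid points of equal length")

    # Weighted difference D(angle) * frequency(angle) * run_value(angle).
    integrand = [
        (pp - pa) * f * r
        for pp, pa, f, r in zip(p_player, p_average, frequency, run_value)
    ]

    # Simple numerical integration over the angle grid (trapezoid rule).
    total = 0.0
    for i in range(n - 1):
        width = angles[i + 1] - angles[i]
        total += 0.5 * (integrand[i] + integrand[i + 1]) * width
    return total


# Toy example: a fielder who is 0.1 better than average at every angle.
toy_angles = [0.0, 10.0, 20.0]
toy_safe = safe_value(
    toy_angles,
    p_player=[0.6, 0.7, 0.6],
    p_average=[0.5, 0.6, 0.5],
    frequency=[1.0, 1.0, 1.0],
    run_value=[1.0, 1.0, 1.0],
)
```

With a player identical to the aggregate model the result is zero, and a player below the aggregate at every angle gets a negative value, which matches the sign convention for SAFE described above.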
Note that our balls-in-play into the air are subdivided into three different types: fly balls, liners, and pop-ups. The following evaluation procedure is performed for each a-bip type separately:

1. Estimating starting locations for each position

Our BIS data does not provide a key piece of information for each a-bip: the location of each fielder before the ball was hit. We estimate the starting location for each position as the (x,y) location in the field where that position has the highest overall probability of making a successful play.

2. Fitting smooth models for the average fielder at each position

We model the probability of a successful play on an a-bip as a smooth function of the distance between the fielder's starting location and the a-bip coordinates. We fit a different function for each velocity category, and also allow a different function for fielders moving to the left or to the right. These models are calculated using the data from all fielders at each position, and so represent the ability of an aggregate fielder at each position.

3. Fitting player-specific models and calculating differences

We fit the same probability models using only the data for each individual fielder, allowing different parameters for each player. Since we have a different model for each individual player, we can quantify the difference between players by comparing each player's probability of making an out to the aggregate probability of making an out. As an example, the figure below illustrates the comparison on fly balls between the aggregate model for the CF position and the individual model for Darin Erstad in 2002.

4. Weighted sum of player-specific differences

For each possible (x,y) coordinate, we can calculate the difference D between a particular fielder's probability of success and the aggregate probability of success.
A rough measure of fielder ability is the sum over all possible (x,y) coordinates of this difference (individual player - aggregate) in the probability of making a successful play. This sum is carried out by simple numerical integration. However, since not all a-bip coordinates occur with equal frequency, our SAFE measures are actually calculated as a frequency-weighted sum, so that more frequent (x,y) coordinates are more important. In addition, our sum is also weighted by the average run consequence of each (x,y) coordinate, which allows us to take into account the different consequences of a-bips to different areas, e.g. a missed a-bip into the outfield power alley has a higher consequence than a missed pop-up in the shallow outfield. The figure below shows the major differences in consequences of a-bips to different areas of the field.

Thus, for an individual player, their SAFE statistic can be interpreted as their expected runs cost/saved relative to the average fielder. A good fielder will have a large positive SAFE, which means a high number of runs saved, whereas a bad fielder will have a large negative SAFE, which means a high number of runs cost.

Overall SAFE Values: Combining BIP Types

We have described our methodology for calculating SAFE for grounders as well as balls hit into the air (which includes liners and fly balls). For each player in each season (2002-2008), their SAFE values within each ball-in-play type are added up over all appropriate ball-in-play types. For infielders, the combined SAFE values consist predominantly of grounder balls-in-play (g-bip) but also include infield flies and liners. For outfielders, the combined SAFE values are aggregated across all ball-in-the-air types (fly balls and liners). These combined SAFE values are available in raw form at the link below:

Year-by-year SAFE values in Excel format

SAFE Over Time

My student James Piette wrote his Ph.D. thesis on "Evaluating Fielding Ability in Baseball Players Over Time". Instead of treating each player-season as independent, he combines information over time within a player. Three new models are compared: a constant-over-time model, a moving average age model and an autoregressive age model. You can download a copy of his thesis here. The SAFE values for these new time-series models are available at the link below:

SAFE Over Time in Excel format

The SAFE estimate columns vary between models, as the first two models are season-specific (i.e. columns refer to season) and the last two models are age-specific (i.e. columns refer to age). There are also additional pairs of columns: for the constant-over-time model, columns related to the player-specific estimates are included, and for the autoregressive age model, columns related to the initial state estimates are included. In general, the SAFE estimate columns can be read as follows: mean refers to the posterior mean, int1 refers to the lower bound of the posterior 95% interval and int2 refers to the upper bound.

Averaged Results: Infielders

Below, we give the SAFE values for each infielder, averaged over the 2002-2008 seasons. Positive values indicate runs saved, whereas negative values indicate runs cost. Within each position, fielders are ranked from best to worst. Only fielders for whom we have enough data (at least 1000 BIP faced) are included. These averages are weighted by the number of BIP faced by the player in each year, but keep in mind that some of the values below may be based on only one or two years' worth of data.
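The BIP-weighted averaging described above can be sketched as follows. This is an illustration only: the function name weighted_average_safe is hypothetical, the season values are invented, and applying the 1000-BIP minimum to a player's total across seasons is an assumption, since the text does not say how the threshold is applied.

```python
MIN_BIP = 1000  # inclusion threshold quoted in the text


def weighted_average_safe(seasons, min_bip=MIN_BIP):
    """Average yearly SAFE values, weighting each year by BIP faced.

    `seasons` is a sequence of (safe_value, bip_faced) pairs, one per year.
    Returns None when the player's total BIP is below `min_bip`
    (assumed here to apply to the total across seasons).
    """
    seasons = list(seasons)
    total_bip = sum(bip for _, bip in seasons)
    if total_bip < min_bip:
        return None
    weighted_total = sum(safe * bip for safe, bip in seasons)
    return weighted_total / total_bip


# Invented example: a strong year on 600 BIP and a weaker year on 400 BIP.
toy_average = weighted_average_safe([(4.0, 600), (1.0, 400)])  # 2.8
toy_excluded = weighted_average_safe([(5.0, 300)])  # None: too little data
```

Because each year is weighted by BIP faced, a season with little playing time moves the average much less than a full season does.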
As mentioned above, the full year-by-year data is available here.

First Baseman | Second Baseman | Third Baseman | Shortstop
1B Joey Votto 5.29 | 2B Craig Counsell 10.18 | 3B Craig Counsell 5.86 | SS Adam Everett 11.27
1B Chris Shelton 3.29 | 2B Chase Utley 8.51 | 3B Placido Polanco 5.65 | SS Alex Rodriguez 10.49
1B Doug Mientkiewicz 3.21 | 2B Mike Fontenot 6.70 | 3B Sean Burroughs 4.48 | SS Craig Counsell 7.40
1B Albert Pujols 3.16 | 2B Aaron Hill 6.60 | 3B David Bell 4.08 | SS Clint Barmes 6.56
1B Casey Kotchman 3.02 | 2B Jose Valentin 6.24 | 3B Nick Punto 3.37 | SS Erick Aybar 6.19
1B Ken Harvey 2.87 | 2B Orlando Hudson 6.09 | 3B Scott Rolen 2.98 | SS Bill Hall 5.32
1B Eric Karros 1.68 | 2B Rey Sanchez 5.15 | 3B Adrian Beltre 2.91 | SS Shane Halter 4.58
1B Mike Sweeney 1.46 | 2B Junior Spivey 5.15 | 3B Freddy Sanchez 2.88 | SS J.J. Hardy 4.51
1B Mark Teixeira 1.44 | 2B Brandon Phillips 4.76 | 3B Jack Hannahan 2.87 | SS Tony Pena 4.30
1B Scott Spiezio 1.19 | 2B Adam Kennedy 4.63 | 3B Pedro Feliz 2.87 | SS Jose Valentin 4.07
1B Kevin Young 0.95 | 2B Pokey Reese 4.04 | 3B Joe Crede 2.29 | SS Chris Gomez 3.93
1B Nick Johnson 0.93 | 2B Placido Polanco 4.02 | 3B Carlos Guillen 2.13 | SS Troy Tulowitzki 3.72
1B Scott Hatteberg 0.89 | 2B Mark Ellis 3.87 | 3B Hank Blalock 1.73 | SS James Hardy 3.41
1B Dan Johnson 0.81 | 2B Tony Graffanino 3.34 | 3B Andy Marte 1.53 | SS Rafael Furcal 3.38
1B Julio Franco 0.74 | 2B Tony Womack 2.69 | 3B Chris Stynes 1.38 | SS Cesar Izturis 3.09
1B Travis Lee 0.54 | 2B Dustin Pedroia 2.59 | 3B Jorge Cantu 1.33 | SS John McDonald 2.86
1B Todd Helton 0.50 | 2B D'Angelo Jimenez 2.52 | 3B Ryan Zimmerman 1.17 | SS Edgar Renteria 2.69
1B Carlos Pena 0.31 | 2B Jerry Hairston Jr. 2.40 | 3B Chad Tracy 1.10 | SS Mike Bordick 2.40
1B Lance Berkman 0.18 | 2B Akinori Iwamura 2.03 | 3B Chipper Jones 1.09 | SS Julio Lugo 2.37
1B Darin Erstad 0.16 | 2B Fernando Vina 1.87 | 3B Brandon Inge 1.05 | SS Juan Uribe 1.62
1B Ross Gload 0.01 | 2B Michael Young 1.84 | 3B Morgan Ensberg 0.99 | SS Jack Wilson 1.48
1B Kevin Youkilis 0.00 | 2B Joe Inglett 1.68 | 3B Jared Sandberg 0.98 | SS Jason Bartlett 1.23
1B Phil Nevin -0.14 | 2B Mark Grudzielanek 1.61 | 3B Jeff Cirillo 0.87 | SS Yunel Escobar 1.08
1B Lyle Overbay -0.17 | 2B Kelly Johnson 1.49 | 3B Bill Hall 0.78 | SS Ryan Theriot 1.08
1B John Olerud -0.19 | 2B Ron Belliard 1.44 | 3B Corey Koskie 0.69 | SS Barry Larkin 0.72
1B Ben Broussard -0.21 | 2B Warren Morris 1.38 | 3B Vinny Castilla 0.61 | SS Ronny Cedeno 0.69
1B Justin Morneau -0.31 | 2B Asdrubal Cabrera 1.33 | 3B Bill Mueller 0.59 | SS Rey Ordonez 0.65
1B Adrian Gonzalez -0.33 | 2B Marlon Anderson 1.19 | 3B Aaron Boone 0.54 | SS Nomar Garciaparra 0.43
1B Shea Hillenbrand -0.48 | 2B Brian Roberts 1.10 | 3B Maicer Izturis 0.47 | SS Royce Clayton 0.39
1B James Loney -0.55 | 2B Jose Lopez 0.87 | 3B Abraham Nunez 0.34 | SS Alex Cora 0.38
1B Paul Konerko -0.61 | 2B Carlos Febles 0.71 | 3B Eric Chavez 0.20 | SS Jimmy Rollins 0.38
1B Jeff Bagwell -0.69 | 2B Josh Barfield 0.00 | 3B Robin Ventura 0.20 | SS David Eckstein -0.31
1B Tino Martinez -0.85 | 2B Nicholas Green -0.12 | 3B Alex Gordon 0.17 | SS Omar Infante -0.36
1B Kevin Millar -1.01 | 2B Juan Uribe -0.19 | 3B Wes Helms 0.15 | SS Orlando Cabrera -0.42
1B Nick Swisher -1.26 | 2B Ronnie Belliard -0.27 | 3B Chone Figgins -0.35 | SS Omar Vizquel -0.55
1B Jim Thome -1.34 | 2B Jamey Carroll -0.67 | 3B Geoff Blum -0.40 | SS Neifi Perez -1.06
1B Ryan Klesko -1.45 | 2B Omar Infante -1.02 | 3B Shea Hillenbrand -0.43 | SS Jose Reyes -1.09
1B Derrek Lee -1.45 | 2B Marcos Scutaro -1.25 | 3B Edgardo Alfonzo -0.47 | SS Juan Castro -1.30
1B Ryan Howard -1.58 | 2B Rickie Weeks -1.28 | 3B Wilson Betemit -0.56 | SS Kazuo Matsui -1.70
1B Jack Snow -1.66 | 2B Aaron Miles -1.33 | 3B David Wright -0.62 | SS Bobby Crosby -1.80
1B Adam LaRoche -1.72 | 2B Alex Cora -1.56 | 3B Troy Glaus -0.62 | SS Miguel Tejada -2.11
1B Daryle Ward -1.72 | 2B Willie Harris -1.70 | 3B Alex Rodriguez -0.71 | SS Tony Womack -2.22
1B Richie Sexson -1.79 | 2B Marcus Giles -1.86 | 3B Evan Longoria -0.84 | SS Cristian Guzman -2.31
1B Aubrey Huff -1.87 | 2B Freddy Sanchez -1.90 | 3B Russell Branyan -1.24 | SS Alex Gonzalez -2.47
1B Rafael Palmeiro -1.92 | 2B Mark Bellhorn -1.97 | 3B Aramis Ramirez -1.25 | SS Khalil Greene -3.05
1B Jeff Conine -1.94 | 2B Mark Loretta -2.27 | 3B Akinori Iwamura -1.74 | SS Alex Cintron -3.52
1B HeeSeop Choi -1.95 | 2B Ian Kinsler -2.35 | 3B Joe Randa -1.88 | SS Carlos Guillen -3.75
1B Sean Casey -1.97 | 2B Kaz Matsui -2.38 | 3B Melvin Mora -1.88 | SS Marco Scutaro -3.86
1B Robert Fick -2.24 | 2B Luis Gonzalez -2.42 | 3B Ty Wigginton -1.92 | SS Jose Hernandez -4.07
1B Eric Hinske -2.48 | 2B Robinson Cano -2.47 | 3B Mark Reynolds -2.14 | SS Jhonny Peralta -4.10
1B Tony Clark -2.60 | 2B Jorge Cantu -2.78 | 3B Mike Lamb -2.15 | SS Chris Woodward -4.30
1B Ryan Garko -2.74 | 2B Howie Kendrick -2.80 | 3B Mark Teahen -2.53 | SS Andy Fox -4.57
1B Randall Simon -2.84 | 2B Tadahito Iguchi -2.83 | 3B Mike Lowell -2.78 | SS Ramon Santiago -4.62
1B Carlos Delgado -2.95 | 2B Brent Abernathy -3.03 | 3B Eric Hinske -3.01 | SS Stephen Drew -4.91
1B Conor Jackson -3.18 | 2B Mark DeRosa -3.15 | 3B Casey Blake -3.11 | SS Angel Berroa -5.47
1B Nomar Garciaparra -3.38 | 2B Todd Walker -3.19 | 3B Edwin Encarnacion -3.46 | SS Yuniesky Betancourt -5.65
1B Craig Wilson -3.39 | 2B Damion Easley -3.57 | 3B Tony Batista -3.67 | SS Ramon Vazquez -5.80
1B Miguel Cabrera -3.85 | 2B Luis Castillo -3.61 | 3B Kevin Kouzmanoff -3.78 | SS Hanley Ramirez -6.60
1B Prince Fielder -4.32 | 2B Jose Vidro -3.83 | 3B Miguel Cabrera -4.03 | SS Felipe Lopez -6.71
1B Jason Giambi -4.56 | 2B Desi Relaford -4.11 | 3B Jose Bautista -4.24 | SS Russ Adams -6.91
1B Fred McGriff -4.94 | 2B Alfonso Soriano -4.23 | 3B Todd Zeile -4.52 | SS Rich Aurilia -7.67
1B Mike Jacobs -4.99 | 2B Eric Young -4.98 | 3B Eric Munson -4.63 | SS Deivi Cruz -8.13
1B Mo Vaughn -5.12 | 2B Ray Durham -5.43 | 3B Aubrey Huff -4.97 | SS Michael Young -8.56
1B Dmitri Young -5.40 | 2B Alexi Casilla -5.73 | 3B Chris Truby -5.19 | SS Ben Zobrist -8.79
 | 2B Craig Biggio -5.77 | 3B Garrett Atkins -6.09 | SS Nick Punto -10.07
 | 2B Dan Uggla -5.93 | 3B Fernando Tatis -6.81 | SS Jeff Keppinger -10.96
 | 2B Jeff Kent -6.13 | 3B Michael Cuddyer -7.32 | SS Brendan Harris -11.22
 | 2B Jose Castillo -6.56 | 3B Ryan Braun -10.55 | SS Derek Jeter -12.16
 | 2B Ruben Gotay -6.93 | |
 | 2B Bret Boone -7.23 | |
 | 2B Miguel Cairo -7.59 | |
 | 2B Alexei Ramirez -7.73 | |
 | 2B Luis Rivas -8.07 | |
 | 2B Ricky Gutierrez -8.35 | |
 | 2B Felipe Lopez -9.01 | |
 | 2B Roberto Alomar -9.70 | |

Averaged Results: Outfielders

Below, we give the SAFE values for each outfielder, averaged over the 2002-2008 seasons. Positive values indicate runs saved, whereas negative values indicate runs cost. Within each position, fielders are ranked from best to worst. Only fielders for whom we have enough data (at least 1000 BIP faced) are included. These averages are weighted by the number of BIP faced by the player in each year, but keep in mind that some of the values below may be based on only one or two years' worth of data. As mentioned above, the full year-by-year data is available here.

Left Fielder | Center Fielder | Right Fielder
LF Covelli Crisp 9.75 | CF Andruw Jones 9.74 | RF Jayson Werth 6.56
LF Matt Diaz 7.49 | CF Darin Erstad 9.17 | RF Alex Rios 5.06
LF Reed Johnson 6.73 | CF Adam Jones 5.38 | RF Dustan Mohr 4.93
LF Carl Crawford 5.45 | CF Carlos Gomez 5.28 | RF Franklin Gutierrez 4.93
LF Scott Podsednik 4.99 | CF Alfredo Amezaga 5.25 | RF Corey Hart 4.87
LF Eric Byrnes 3.96 | CF Curtis Granderson 5.13 | RF Gary Matthews Jr. 4.63
LF Dave Roberts 3.82 | CF Exavier Logan 4.83 | RF Austin Kearns 4.25
LF Emil Brown 3.54 | CF Jim Edmonds 4.60 | RF Richard Hidalgo 4.11
LF Brian Jordan 2.42 | CF Shane Victorino 4.17 | RF Casey Blake 4.05
LF Brad Wilkerson 2.18 | CF Brian Anderson 3.95 | RF Jeff Francoeur 4.04
LF Randy Winn 2.02 | CF Doug Glanville 3.81 | RF David Drew 3.73
LF Jacque Jones 1.36 | CF Brady Clark 3.79 | RF Randy Winn 3.67
LF Garret Anderson 1.28 | CF Aaron Rowand 3.67 | RF Geoff Jenkins 3.33
LF Chipper Jones 1.24 | CF Mike Cameron 3.66 | RF Jose Cruz 3.03
LF Matthew Holliday 1.23 | CF Joey Gathright 3.03 | RF Hunter Pence 2.83
LF Terrence Long 1.19 | CF Jeremy Reed 2.87 | RF Brady Clark 2.61
LF Alfonso Soriano 1.16 | CF Eric Byrnes 2.82 | RF Ichiro Suzuki 2.38
LF Shannon Stewart 1.15 | CF Juan Pierre 2.62 | RF Nelson Cruz 2.10
LF Moises Alou 1.13 | CF Grady Sizemore 2.29 | RF Matt Lawton 2.02
LF Albert Pujols 0.80 | CF Jay Payton 1.83 | RF Jason Lane 1.81
LF Jason Michaels 0.66 | CF Chris Duffy 1.81 | RF Jody Gerut 1.78
LF David Dellucci 0.63 | CF Chone Figgins 1.70 | RF Jeromy Burnitz 1.54
LF Jay Payton 0.58 | CF Gary Matthews Jr. 1.58 | RF Trot Nixon 1.38
LF Barry Bonds 0.43 | CF Covelli Crisp 1.40 | RF Alexis Rios 1.33
LF Pat Burrell 0.22 | CF Carlos Beltran 1.18 | RF Andre Ethier 1.14
LF Matt Holliday 0.00 | CF Torii Hunter 1.10 | RF Reggie Sanders 0.95
LF Luis Gonzalez -0.15 | CF Marlon Byrd 1.06 | RF Delmon Young 0.66
LF Craig Monroe -0.37 | CF Willy Taveras 0.94 | RF Jacque Jones 0.65
LF Geoff Jenkins -0.54 | CF Dave Roberts 0.53 | RF Jose Guillen 0.56
LF Kevin Mench -0.61 | CF B.J. Upton 0.45 | RF Emil Brown 0.55
LF Frank Catalanotto -0.83 | CF Scott Podsednik 0.30 | RF Bobby Higginson 0.53
LF Jason Bay -0.84 | CF Corey Patterson 0.27 | RF Brian Giles 0.53
LF Jose Guillen -1.34 | CF Mark Kotsay 0.08 | RF Bobby Kielty 0.24
LF Rondell White -1.51 | CF Luis Matos 0.05 | RF Sammy Sosa 0.12
LF Raul Ibanez -1.58 | CF Kenny Lofton -0.52 | RF Jay Gibbons 0.08
LF Lance Berkman -1.64 | CF Rocco Baldelli -0.72 | RF Xavier Nady -0.15
LF Carlos Lee -1.82 | CF Milton Bradley -0.74 | RF Nick Markakis -0.57
LF Todd Hollandsworth -2.24 | CF Chris Singleton -0.90 | RF Danny Bautista -0.65
LF Larry Bigbie -2.27 | CF Coco Crisp -1.09 | RF J.D. Drew -0.73
LF Adam Dunn -2.28 | CF Nate McLouth -1.34 | RF Jeremy Hermida -0.73
LF Marcus Thames -2.57 | CF Chris Young -1.37 | RF Shawn Green -1.16
LF Chris Duncan -2.62 | CF Endy Chavez -1.52 | RF Juan Encarnacion -1.20
LF Ryan Klesko -2.76 | CF Vernon Wells -1.95 | RF Vladimir Guerrero -1.44
LF Josh Willingham -2.97 | CF Laynce Nix -2.03 | RF Aaron Guiel -1.52
LF Cliff Floyd -2.99 | CF Alex Sanchez -2.31 | RF Raul Mondesi -1.59
LF Brian Giles -3.01 | CF Ryan Freel -2.44 | RF Robert Fick -1.78
LF Jeff Conine -3.50 | CF Tike Redman -2.45 | RF Kevin Mench -1.97
LF Matt Lawton -3.61 | CF Steve Finley -2.53 | RF Aubrey Huff -2.03
LF Hideki Matsui -3.72 | CF Johnny Damon -2.59 | RF Magglio Ordonez -2.35
LF Miguel Cabrera -5.83 | CF David DeJesus -3.05 | RF Larry Walker -2.51
LF Manny Ramirez -8.80 | CF Terrence Long -3.08 | RF Michael Tucker -2.86
LF Delmon Young -10.04 | CF Preston Wilson -3.08 | RF Bobby Abreu -3.19
 | CF Craig Biggio -3.25 | RF Tim Salmon -3.25
 | CF Randy Winn -3.30 | RF Ryan Ludwick -3.41
 | CF Ichiro Suzuki -4.03 | RF Ben Grieve -4.17
 | CF Lastings Milledge -4.04 | RF Craig Wilson -4.49
 | CF Brad Wilkerson -4.08 | RF Gary Sheffield -4.85
 | CF Marquis Grissom -4.91 | RF Mark Teahen -4.97
 | CF Cory Sullivan -5.77 | RF Jermaine Dye -5.15
 | CF Melky Cabrera -5.80 | RF Michael Cuddyer -7.65
 | CF Josh Hamilton -6.30 | RF Ken Griffey Jr. -8.16
 | CF Ken Griffey Jr. -8.24 | RF Brad Hawpe -8.83
 | CF Bernie Williams -8.85 |