I don’t believe in soul mates. I don’t think that there is one perfect match out there for everyone or that most of life is just a quest to find them. But just because there isn’t a perfect fit doesn’t mean there isn’t a best fit. You can’t win every hand of blackjack, but you can still play a strategy that’s better than any other strategy. This is a baseball website, so when we put this in the context of baseball players, we find the Pittsburgh Pirates and A.J. Burnett. While the Pirates and Burnett aren’t soul mates, they’re about as close to an optimal fit as you’ll find in the game today.

This isn’t a new discovery. There was plenty of attention paid to Burnett’s career rejuvenation in the Steel City as it occurred, and Dave was kind enough to remind us of that fact when he reviewed the Pirates’ one-year contract with the pitcher this past December. Dave’s thesis, unsurprisingly, was that this reunion made sense because of how well the player matches the team, but also that Burnett was another year older and would probably not be the pitcher they remembered from 2012 and 2013.

Dave was half right, at least so far. Burnett and the Pirates are helping each other in 2015, but instead of easing into retirement, Burnett is having arguably his best season as a professional. Certainly, it’s wise to factor in some regression from his current numbers, but even if you do that, he’s right on track to finish 2015 at a level very similar to his previous two seasons with the Bucs.

No one is terribly surprised that Burnett’s having more success now that he’s back in Pittsburgh after a year in Philadelphia, but the fact that he looks almost perfectly back on track offers us an opportunity to explore the complex relationship between a pitcher and the context within which he’s working.

I would assume that many regular readers are thinking about a couple of new statistics available at Baseball Prospectus developed by Jonathan Judge and company. There’s cFIP, which debuted at The Hardball Times, and DRA, which came later at BP. While I don’t intend to explore in depth the exact nature of each statistic, it’s clear that they perform well when stacking them up against other run estimators in terms of prediction and explanation. Their claim to fame is that they attempt to control for context more than anything else out there. In other words, if A.J. Burnett were the same guy in 2014 and 2015, cFIP and DRA should be able to strip out the differences between the Pirates and the Phillies around him. That isn’t what the data shows.

Year   ERA-   FIP-   xFIP-   cFIP   DRA
2012     92     92      87     97   4.33
2013     91     76      77     81   3.64
2014    126    110     107    114   4.40
2015     59     77      86     92   3.43

This isn’t a critique of the stats, as it’s totally plausible that Burnett actually pitched worse in 2014. DRA and cFIP might be measuring exactly what they claim to be measuring and the underlying data generating process was simply that Burnett was worse. But this is an interesting test case because we have a decent natural experiment before us. Burnett pitched for one team, then another, and then came back to the first team. He got older and some of the players changed, but a lot of things stayed the same.

In other words, I think cFIP and DRA do a pretty good job controlling for some of the most important contextual factors on their own. They include things like quality of competition and the average quality of the defenders on the field in their model (among many other things) and those are important factors. Stats like FIP and xFIP are a little less ambitious in that respect, but they’re easier to manage and similarly useful.

The cohort of stats we’re looking at controls for opponent, park, and defense in a variety of ways, yet we still observe the performance decline and rebound all the same. The BP stats control for catcher and umpire and use run expectancy to strip out the bullpen. There are a lot of factors that are being held constant in these models and yet we still observe the 2014 decline.
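The decline and rebound can be read straight off the table above. As a minimal sketch, the snippet below simply subtracts one season from another for each metric; the function names are my own, and the only inputs are the numbers already listed.

```python
# Season lines copied from the table above. ERA-, FIP-, xFIP- and cFIP are
# indexed so that 100 is league average and lower is better; DRA is on a
# runs-per-nine-innings scale, where lower is also better.
SEASONS = {
    2012: {"ERA-": 92, "FIP-": 92, "xFIP-": 87, "cFIP": 97, "DRA": 4.33},
    2013: {"ERA-": 91, "FIP-": 76, "xFIP-": 77, "cFIP": 81, "DRA": 3.64},
    2014: {"ERA-": 126, "FIP-": 110, "xFIP-": 107, "cFIP": 114, "DRA": 4.40},
    2015: {"ERA-": 59, "FIP-": 77, "xFIP-": 86, "cFIP": 92, "DRA": 3.43},
}


def change(metric, start, end):
    """Change in a metric between two seasons (positive means worse)."""
    return round(SEASONS[end][metric] - SEASONS[start][metric], 2)


def dipped_in_2014(metric):
    """True if 2014 was worse than both the 2013 and 2015 Pittsburgh seasons."""
    return change(metric, 2013, 2014) > 0 and change(metric, 2014, 2015) < 0


for metric in SEASONS[2012]:
    print(metric, change(metric, 2013, 2014), change(metric, 2014, 2015),
          dipped_in_2014(metric))
```

Every column, from plain ERA- to the heavily adjusted DRA, moves the same direction in both steps, which is the pattern the rest of this piece is trying to make sense of.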

We can never actually determine what caused Burnett’s 2014 decline, but as we look at his 2015 numbers and observe the rebound, it’s worth at least thinking about how exactly the Pirates make him better and if we can truly capture that in a single, composite way. The argument isn’t that we shouldn’t control for any of these factors. It’s that when we try to model them all together, we lose a good understanding of how they operate at a very fundamental level.

Let me try to summarize the argument a little more clearly. We have stats that try to strip out a pitcher’s context so that we’re measuring only his specific contribution to a given set of innings. We’re trying to determine how much credit he deserves for the runs prevented or we’re trying to measure exactly how good he is at a given point in time. What I think we gloss over in this discussion too often is that a pitcher might actually pitch worse in a context that is worse.

It’s not just a matter of the same pitch leading to a worse outcome because the catcher can’t frame or the defense can’t field, it’s a matter of the pitcher throwing a worse pitch because everything around him is different. The interactions between all these factors are very interesting, and even the rather advanced models used in DRA and cFIP can’t capture that. We should absolutely care if a catcher can steal his pitcher an extra strike, but the fact that he’s pitching to a good framer in the first place changes how he behaves as a pitcher on top of that.

The same is true with regard to defense and ballpark. Burnett likely pitches differently because of how aggressively his team plays defense and he throws a different set of pitches knowing he’s playing at PNC Park compared to Citizens Bank Park. Context is a two-way street. It’s a feedback loop. Borrowing a term from game theory, we might say that Burnett is playing his “best response” given the actions of the actors around him.
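To make the “best response” idea concrete, here is a toy sketch. Everything in it is hypothetical: the two contexts, the two approaches, and especially the expected-runs figures are invented purely for illustration and are not estimates for Burnett or any real team.

```python
# HYPOTHETICAL expected runs allowed per inning for each (context, approach)
# pair. These numbers are made up to illustrate the concept only.
EXPECTED_RUNS = {
    ("good framer, big park", "attack the zone"): 0.38,
    ("good framer, big park", "nibble at the edges"): 0.45,
    ("poor framer, small park", "attack the zone"): 0.55,
    ("poor framer, small park", "nibble at the edges"): 0.50,
}

APPROACHES = ["attack the zone", "nibble at the edges"]


def best_response(context):
    """Return the approach that minimizes expected runs in the given context."""
    return min(APPROACHES, key=lambda approach: EXPECTED_RUNS[(context, approach)])


for context in ("good framer, big park", "poor framer, small park"):
    print(context, "->", best_response(context))
```

The pitcher is identical in both rows of this toy example. Only his surroundings change, and with them the approach that serves him best.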

A good framer probably helps the runs allowed number for every pitcher, but while I don’t have hard data to back it up yet, I would wager almost anything that good framers help different pitchers in very different ways. The same is true for defense. Lorenzo Cain is more valuable to certain pitchers while Jose Iglesias is more useful for others. Both are great, but their effect isn’t one size fits all. Just look at Burnett’s called-strike heat maps from 2014 and 2015. The red-colored area expands from one year to the next, particularly downward in the zone. Burnett has to know how much better Francisco Cervelli is and presumably he’s aiming differently as a result.

I wish we had better data with which to leverage this idea, but for now it has to remain more theoretical than empirical. We can do a fine job stripping out context and assigning credit in the aggregate. If you have a good park factor, you can apply it to a full season of data and have a pretty good estimate of the park’s effect on the pitcher’s overall season, but what you really want to know is how the park affected the flight of each batted ball. And beyond that, wouldn’t it be fascinating to know how much being in the park itself impacted the choices the pitcher made and his execution? There’s a good argument to be made, for example, that Phil Hughes started throwing more strikes in 2014 precisely because well-hit balls wouldn’t carry the fence at Target Field like they would at Yankee Stadium. And those extra strikes led to other, unrelated benefits as well.
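For reference, the aggregate adjustment described above is about as simple as it sounds. The sketch below assumes a park factor on the common 100-is-neutral scale that already accounts for the home/road split; the function name and the example factors are hypothetical and are not the actual figures for PNC Park, Citizens Bank Park, or anywhere else.

```python
def park_adjust(runs_per_nine: float, park_factor: float) -> float:
    """Scale a season-level run rate to a neutral park.

    park_factor uses 100 as neutral; values above 100 favor hitters.
    It is assumed to already reflect that only about half of a pitcher's
    innings come at home.
    """
    if park_factor <= 0:
        raise ValueError("park_factor must be positive")
    return runs_per_nine / (park_factor / 100.0)


# Hypothetical example: the same raw 4.00 run rate in two different parks.
raw = 4.00
print(round(park_adjust(raw, 104), 2))  # hitter-friendly park: adjusted down
print(round(park_adjust(raw, 96), 2))   # pitcher-friendly park: adjusted up
```

Notice what this does and doesn’t do. It rescales the season total by one number, the same number for every pitcher in that park. It has nothing to say about any individual batted ball, and even less about whether the pitcher changed his approach because of where he was standing.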

This isn’t to suggest that I think the stats improperly dinged Burnett in 2014 and failed to capture something about his context. Instead, I’m arguing that a pitcher’s true talent is actually context dependent. We know the observed performance is context dependent, but Burnett might actually be better as a Pirate because it’s an environment that is ideal for him. Other pitchers might not excel there and would be better served with the Mets or the Astros.

The idea of context-neutral stats is good for assigning retroactive credit. It’s really the only way to do it. But I think it’s reasonable to argue that if you take the same pitcher and put him on all 30 teams, you wouldn’t just see random differences and differences due to different defensive plays, you’d actually get the pitcher to behave differently.

This is something that makes studying baseball challenging and also exciting. Even when researchers leverage an advanced statistical framework like mixed models and include controls for a huge number of things, they’re still barely scratching the surface of the complicated interactions that go into every single pitch.

The context Burnett stepped back into this year is largely independent of him in that it would have looked roughly the same if he had chosen to retire. But when he steps onto the mound, the catcher calls certain pitches based on Burnett’s strengths and the defense aligns in a certain way as well. Then Burnett responds by making pitches based, in part, on his catcher, defense, and park (to say nothing of the opponents). It’s not just the outcomes that are context dependent; I think the inputs are, too.

There are probably a lot of much simpler reasons why Burnett is having a better year. He dealt with an injury last year, even if he did throw 213 innings and didn’t seem to experience a huge loss of stuff. He may have faced better prepared opponents, or our old friend random variation might have just caught him at a bad time. We can’t really know, but the fact that Burnett is pitching so well in 2015 after coming back to Pittsburgh, even according to the metrics which try to remove all sorts of contextual factors, makes you wonder just how important it is for certain pitchers to find the right fit.