By Sean Steffen (@seansteffen)

If you’ve ever played FIFA, you’ve probably noted the importance of a forward’s “finishing” rating to how often they finish their chances. That’s how it works in the video game, but is “finishing” a real life skill significant enough to make an impact in a forward’s goal scoring tally?

While I have yet to meet a data analyst who thinks that “finishing skill” is as relevant to goal scoring as most soccer fans tend to believe, there doesn’t seem to be a consensus in terms of whether “finishing” is a repeatable skill. In other words, can forwards depend on a superior ability to convert chances year to year?

With forwards like Gyasi Zardes (16 goals in 2014) and Cyle Larin (17 goals in 2015) bursting onto the scene by converting a high percentage of their chances on goal, the question within MLS is as important as ever. Are these players scoring so many goals because of some underlying finishing skill, or are their unusually high finishing rates something closer to statistical noise?

Is finishing a skill of any importance within MLS?

One important tool we can use for answering such a question is to study discrepancies in expected goals (xG) data. Since the expected goals model is built around league averages of conversion, if finishing were a skill of any statistical note we would see a consistent out-performance of the model by certain shooters who are highly skilled finishers. But before we get into repeatability for individuals, I’d like to use goals minus expected goals (G-xG) data to look at the question in much broader strokes.
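To make the G-xG idea concrete, here is a minimal sketch of how a percentage over-performance figure could be computed. It assumes the percentage is taken over pooled totals (total goals minus total xG, divided by total xG); the function name and the sample numbers are made up for illustration and are not taken from the data set discussed in this article.

```python
def overperformance_pct(goals, xg):
    """Percent by which total goals exceed total expected goals.

    Assumption: the figure is computed on pooled totals, not as an
    average of per-player percentages.
    """
    total_goals = sum(goals)
    total_xg = sum(xg)
    if total_xg <= 0:
        raise ValueError("total xG must be positive")
    return 100.0 * (total_goals - total_xg) / total_xg


# Made-up numbers for three player seasons, purely illustrative.
goals = [12, 5, 8]
xg = [10.0, 6.0, 7.0]
print(round(overperformance_pct(goals, xg), 1))  # prints 8.7
```

A positive value means the group scored more than the model expected, and a negative value means it scored less.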

Evidence of finishing in the aggregate

One of the better articles on the subject of finishing was a study conducted by Michael Caley of Cartilage Free Captain which can be read here. In said article, Caley examined forwards in the EPL over multiple seasons and discovered that “almost all of the over-performance of expected goals among forwards comes from the performance of a relatively small number of high-volume shooters.”

Specifically, he found that among forwards, high volume shooters (which he defined as forwards with over 3.5 shots per 90) score about 15% more goals than expected, while medium and low volume shooters combined to outscore their xG by a mere 2.5%. “Shot volume predicts goal conversion in the aggregate,” Caley concluded, further explaining “better finishers get more chances.”

It’s truly excellent work by Caley and it’s an experiment which I have repeated within our data set to see if similar evidence of finishing can be found in MLS.

The parameters I used were quite simple, but slightly different from Caley’s. I aggregated four years of MLS data (2011-2014) and looked only at seasons where a forward had 1000 minutes or more, whereas Caley applied no control for minutes. This came out to 254 individual forward seasons with a combined total of 13,621 shots, which compares favorably to the 287 individual season performances looked at by Caley, which combined for 12,547 shots.

I then divided the group into the three categories specified in Caley’s article: Low Volume shooters, Medium Volume Shooters and High Volume shooters.

“Low Volume shooters” were defined as forwards taking 0-1.99 shots per 90 within a season, which came out to 56 instances, or 22% of our data set. By comparison, this group comprised 32% of Caley’s data set. The findings on this end of the spectrum happen to be the least important to this article; however, in the interest of putting all data on the table, low volume shooters in MLS fared better than their EPL counterparts, outperforming their xG by 6.8% compared to the EPL group’s 2%. The big caveat here is that this 6.8% is heavily skewed by our minutes cutoff in a way the other two groups are not.

The group actually underperforms by 4.6% if you add the players with under 1000 minutes into the mix. Of course, this creates a bigger sample size than Caley had, and it skews the data in the other direction: many sub-1000-minute players take only 4 or 5 shots, accruing xG without playing enough minutes to let it stabilize. Since this group is unimportant to this article, and the findings gleaned from the medium and high volume shooters, which I am about to go into, remain the same either way, I decided to keep a control on minutes in the interest of keeping the sample sizes similar.

“Medium volume shooters” were defined as forwards who took 2-3.49 shots per 90 within a season. We had 172 instances of this, which make up 68% of the data set. By comparison, this group made up 49% of Caley’s data set. Medium volume shooters in MLS fare about as well as their EPL counterparts, with MLS medium volume shooters outperforming their xG by an average of 7.7%, while this same group in the EPL outperformed by 7.3%.

Our final group is high volume shooters which, if you recall, is where Michael Caley found the bulk of the G-xG over-performance and strong evidence for the existence of finishing.

“High volume shooters” were defined as forwards who took 3.5 shots per 90 or more within a season. We had 26 instances of this, making up 10% of the data. This group’s share of the MLS data was considerably smaller than in the EPL, where Caley had 56 data points which comprised 18.5% of his sample. But this wasn’t the only discrepancy. One of the most significant findings in Caley’s study was that forwards in this group tended to outperform their xG by a whopping 14.8%. In our MLS data, however, the over-performance was a mere 7.2%, showing no significant difference from the medium volume group.
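For readers who want to repeat the exercise, the three-way split described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names (minutes, shots, goals, xg) and the sample seasons are hypothetical, and the over-performance figure is computed on pooled totals per group, which may not match exactly how the numbers in this article were produced.

```python
from collections import defaultdict

MIN_MINUTES = 1000  # minutes cutoff used in this article


def shots_per_90(shots, minutes):
    """Shots scaled to a 90-minute rate."""
    return shots * 90.0 / minutes


def volume_bucket(sp90):
    """Low: under 2 per 90, medium: 2 to under 3.5, high: 3.5 or more."""
    if sp90 < 2.0:
        return "low"
    if sp90 < 3.5:
        return "medium"
    return "high"


def summarize(seasons):
    """Group forward seasons by shot volume and pool goals and xG.

    `seasons` is an iterable of dicts with hypothetical keys
    "minutes", "shots", "goals" and "xg".
    """
    totals = defaultdict(lambda: {"n": 0, "goals": 0.0, "xg": 0.0})
    for s in seasons:
        if s["minutes"] < MIN_MINUTES:
            continue  # drop seasons below the minutes cutoff
        bucket = volume_bucket(shots_per_90(s["shots"], s["minutes"]))
        totals[bucket]["n"] += 1
        totals[bucket]["goals"] += s["goals"]
        totals[bucket]["xg"] += s["xg"]

    n_all = sum(t["n"] for t in totals.values())
    summary = {}
    for bucket, t in totals.items():
        over = None
        if t["xg"] > 0:
            over = 100.0 * (t["goals"] - t["xg"]) / t["xg"]
        summary[bucket] = {
            "n": t["n"],
            "share_pct": 100.0 * t["n"] / n_all,
            "over_pct": over,
        }
    return summary


# Made-up seasons, purely illustrative (not real MLS data).
seasons = [
    {"minutes": 2700, "shots": 45, "goals": 4, "xg": 5.0},     # 1.5 per 90
    {"minutes": 1800, "shots": 50, "goals": 6, "xg": 5.0},     # 2.5 per 90
    {"minutes": 2700, "shots": 120, "goals": 11, "xg": 10.0},  # 4.0 per 90
    {"minutes": 500, "shots": 30, "goals": 3, "xg": 1.0},      # excluded
]

for bucket, row in summarize(seasons).items():
    print(bucket, row)
```

Removing the minutes check in summarize is the quickest way to see how much the cutoff moves each group, which is the sensitivity discussed for the low volume shooters.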

Does this mean that the top forwards in the EPL possess a skill that the top forwards in MLS simply do not? This is certainly possible, as there is an undeniable talent gap between these two groups. Of course, we could also be seeing an artifact of a small sample size since we only have 26 data points in the high volume category whereas Caley had 56. This sample size issue remains a problem even if we expand our data to players with fewer than 1000 minutes, while the congruity with the expanded medium volume group remains the same.

Of course, the fact that MLS has so few high volume shooters is itself a significant finding because it demonstrates that MLS does not show the same level of stratification in shot volume data. MLS has more medium volume shooters than the EPL and fewer low and high volume shooters.

Perhaps another explanation for this beyond talent is the leveling effect of MLS’s built-in parity. With all teams working under the same cap, the gap in shots between the “haves and have-nots” is smaller, allowing for a more standardized bell curve.