Emergency!

You’ve just found out your star MLS Fantasy player isn’t in his real-life lineup for the week, and you’ve got 20 minutes before the next round of games starts to pick the best replacement. You don’t have time for a full analysis of all the available players; you need to keep your breakdown to a minimum. You probably only have time to look at one or two key indicators of potential success.

I run the Twitter account @mlsfantasystats, offering weekly points predictions and advice to managers of MLS Fantasy (often abbreviated as FMLS). Often I am asked what indicators can be used to give a fantasy manager a good idea of how likely a particular player is to obtain a good score in a given week, especially if my predictions are not available for that round or if someone runs into the situation described above.

Earlier this year, I built a linear regression model that measured how strongly each player’s season average in every relevant fantasy scoring category correlates with the total points that player scores in the next round.

In simpler terms, I used math to figure out how good individual stats such as assists, completed passes, or saves are at predicting future fantasy performance on their own.

Here are the results!





Here’s how to read this chart: Each number in the chart is a correlation coefficient, often denoted by the letter r in statistics. It always falls between -1 and 1, and its distance from 0 indicates the strength of the correlation. The farther a stat’s correlation coefficient is from 0, the better that stat is at predicting the total points a player in that position will score in the next game. A positive coefficient means more of the stat tends to go with more points; a negative one means more of the stat tends to go with fewer points. If the coefficient is close to 0, the stat tells you almost nothing about how many points the player will score.
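
For anyone who wants to compute a coefficient like this themselves, here is a minimal sketch of the standard Pearson formula in Python. It is not the code behind the chart: the function name pearson_r and the toy numbers are assumptions made up for illustration, not real FMLS data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (used in the denominator of r).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: a season average going into a round, paired with
# the points the same player scored in that round. Not real FMLS numbers.
shots_per_90 = [1.2, 3.4, 2.1, 0.8, 2.9, 1.7]
next_round_points = [2, 9, 4, 3, 6, 5]

r = pearson_r(shots_per_90, next_round_points)
print(f"r = {r:.2f}")
```

Each cell in a chart like this one would come from running that calculation once per stat and per position, pairing every player’s season average with the points he scored in the following round.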

If you want a general sense of how easy it is to predict each position, look at the amount of green in each column: The greener it is, the easier it is to predict. Midfielders take the top prize for predictability here, with goalkeepers being the apparent loser. This shouldn’t come as a surprise, and it can be seen in the point distributions for each position.

Back to the original chart. Far and away the top indicator of fantasy performance overall is whether the player’s team is playing at home. It should be obvious, but you’d better be highly confident in a player’s other numbers if you plan on picking them for an away game. The deck is stacked against them, especially if they’re defenders or goalkeepers. A few stats, such as saves, are higher for the away team (3.6 saves/game versus 2.9), but most of the time they can’t make up for the massive disadvantage road players face in the other stat areas.

Now let’s break things down on a position-by-position level.

Forwards

Number of Shots per 90 Minutes is the most predictive statistic for forward point production, followed closely by the omnipresent playing at home.

And it turns out when predicting forwards' points, it's just about as important that they don't get tackles as it is that they do score goals! Both stats are tied for third in correlation, but tackles are negatively correlated to forwards' FMLS point production, meaning the more tackles they get the less likely they are to get higher scores. Forwards are primarily tasked with scoring goals, and if they’re going around making tackles, they’re probably not scoring goals as often.

Out-of-position players can sometimes be a hidden gem for your team, but if a player is listed as a forward in MLS Fantasy while actually playing as a fullback or deeper-lying midfielder, best avoid them. They carry a defender’s downside, limited chances at attacking points, with none of the upside of clean sheet chances.

Midfielders

Key Passes per 90 Minutes is the most predictive stat for midfielders. Assists are by definition a subset of Key Passes, and the two are highly correlated (r = .640). Because Key Passes offers the larger set of data points, it is less affected by randomness, which makes it more likely to be a good predictor than Assists.

An interesting thing to note is that there are fairly strong negative correlations between the numbers of Clearances and Interceptions a midfielder has and their fantasy point production. It doesn’t look good for defensive midfielders. What this doesn’t tell us is whether midfielders with higher numbers of clearances and interceptions have what’s known as a “high floor,” meaning they have low variance and are almost guaranteed to get a few points each game. Perhaps that can be a topic for another article.

The biggest surprise is that Crosses per 90 Minutes has the second-highest correlation to FMLS Points! As it turns out, midfielders’ Crosses/90 is correlated not only with their Points but even more so with other variables such as Key Passes/90 (r = .777), Assists/90 (r = .505), and Big Chances Created/90 (r = -.368). This means it inherently contains a not-insignificant portion of the prediction ability of all these variables, adding up all that predictive power to become one of the best indicators for the position.
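
To see how a stat can borrow predictive power from the stats it travels with, here is a small simulation sketch. Everything in it is synthetic and assumed for illustration: the noise levels, the sample size, and the rule that points depend on key passes alone are invented, and the values are standardized numbers, not real per-90 figures.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 2000                 # number of synthetic player-rounds (assumption)

# Synthetic, standardized "key passes" values.
key_passes = [rng.gauss(0, 1) for _ in range(n)]
# Crosses move with key passes, plus noise of their own.
crosses = [kp + rng.gauss(0, 0.8) for kp in key_passes]
# Points are generated from key passes only; crosses never enter this formula.
points = [kp + rng.gauss(0, 1) for kp in key_passes]

r_key_passes = pearson_r(key_passes, points)
r_crosses = pearson_r(crosses, points)
print(f"key passes vs points: r = {r_key_passes:.2f}")
print(f"crosses vs points:    r = {r_crosses:.2f}")
```

In this toy setup crosses still show a clearly positive correlation with points, even though they play no part in how the points were generated. They simply inherit it from the stat they are correlated with.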

Defenders

To put things succinctly, playing at home for defenders has the highest correlation coefficient in the table, and that’s most of what you need to know.

Granted, the quality of the player’s own team defense is another decent indicator, along with the passing stats, but they pale in comparison to playing at home. Picking away defenders is a big risk.

Likely the most significant finding in this study is that the quality of the opposing offense (Opp xGF) is rather unimportant when choosing defenders, and even less predictive than the quality of the opposing defense! Perhaps this is due to the change in style most teams make when playing away; many blame at least some of home field advantage on visitors adopting a more conservative playing style, which makes home field advantage somewhat of a self-fulfilling prophecy. Whether “defense wins silverware” holds true in soccer remains to be seen, but at least on the fantasy side of the sport, the ability of opponents to create scoring chances isn’t very useful in predicting defender performance.

Goalkeepers

Pick home goalkeepers. Period.

Honestly, every other stat in the study is garbage if used by itself to predict goalkeeper performance. With r = -0.01, even the number of saves a goalkeeper gets is essentially random when it comes to predicting FMLS points!

But if you’re going to use other stats besides home/away to choose your goalkeepers (and I strongly urge caution when doing so), note that the opponent’s defense (Opp xGA) is actually a better predictor of GK points than the keeper’s own team’s defense (Team xGA)! As it turns out, for goalkeepers, “the best offense” is a lack of opposition defense. My guess is that this phenomenon has something to do with clean sheet probabilities: the lower the quality of the opposing defense, the greater the chances of the goalkeeper’s team getting the lead and being able to sit back and defend better.

The variability inherent to goalkeeper scoring is why the keeperoo is a must-know strategy. According to Eric Thulin at MLS Fantasy Boss, the average keeperoo nets you 1.5 points for single-game-weeks just by using it because of how unpredictable keepers’ scores are! You basically get two shots at getting a good goalkeeper score, and with goalkeepers being the most variable position but the cheapest upgrade to a starter, I’d spend the extra few $$ on a second keeper to decrease the risk.

What We’ve Learned

If we were to boil this all down to a few takeaways, here’s what I’d recommend for you when you’re building your MLS Fantasy roster in the upcoming season.

Use the keeperoo. Goalkeepers are too unpredictable not to use this tactic if you have the budget. If given a choice between a keeperoo and a regular old switcheroo with outfield players, do the keeperoo if only for the inexpensive upgrade to a starter keeper from a bench-sitter.

Play home defenders on good defenses. Don’t pay much attention to their opponent.

Passes are the key to midfielders, particularly Key Passes and Crosses. That being said, midfielders who rely heavily on defensive actions for their points probably aren’t going to score as highly as their more offensive-minded counterparts such as attacking mids or wingers.

Get yourself forwards who shoot and shoot often. If you must look at their total goals so far, also look at their total tackles, because they’re about equal in their predictiveness.

And lastly, in case it wasn’t clear before, don’t use away players if you can help it. Just don’t.

One final piece of encouragement, if you can call it that: even combining several of these variables into a multiple regression model (not shown here) explains only around 11 percent of the variance in players’ points. In other words, despite your best efforts, roughly eight-ninths of your fantasy performance is either not yet understood or is just plain dumb luck. Happy lineup building!
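
The arithmetic behind that last fraction is simple enough to check. This is a minimal sketch that takes the roughly 11 percent figure as the model’s R-squared (the share of variance explained); the helper name unexplained_share is made up for illustration.

```python
def unexplained_share(r_squared):
    """Fraction of variance left unexplained by a model with this R-squared."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R-squared must lie between 0 and 1")
    return 1.0 - r_squared


model_r_squared = 0.11  # "around 11 percent", as quoted in the article

print(f"unexplained share: {unexplained_share(model_r_squared):.2f}")  # 0.89
print(f"eight-ninths:      {8 / 9:.3f}")                               # 0.889
```

The two numbers agree to within a fraction of a percentage point, which is where the eight-ninths shorthand comes from.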

Ryan Anderson is a data-driven insights evangelist currently working as a data analyst in Des Moines, IA. He moonlights as a certified soccer referee and provides player projections and advice for the MLS Fantasy community via his Twitter account.