Week 3 NPL Analysis: “... Some circles are more equal than others.” (Part 2)

This week I hope to definitively answer how many points each circle is worth. But first, let’s look at the NPL Royale.

NPL ROYALE “LUCK” PLOTS

Here’s how teams performed relative to their circle favor in all matches in the NPL Royale, including both the group stage and the final stage. This is kind of a mess unless you’re looking for one specific team!

Points per match diverge a ton more from the average here than in the plot for the NPL so far, which you can see here. That makes sense, because a longer series of matches gives a clearer and more consistent picture of a team’s performance.

Simplicity in particular really outplayed their circle favor. C9 put up a great performance with much less circle favor than usual, and Envy also killed it.

Figure 1: Total points per match for each NPL team based on circle favor, for the Phase 2 NPL Royale. Teams that are above the black line performed better than expected, and teams below the line, worse. The gray lines represent one standard deviation from the mean in points per match.

Here’s how things went in the finals, with a dotted trace from each team’s group stage performance for comparison. It’s very apparent in this plot how circle luck and points per game can swing wildly from match to match if there are only a few matches in a tournament. Tempo and SSG stand out here.

Figure 2: Total points per match for each NPL team based on circle favor, for the Phase 2 NPL Royale Finals. Teams that are above the black line performed better than expected, and teams below the line, worse. Dotted lines indicate change from the group stage to the finals. Gray lines represent one standard deviation from the mean in points per match.

HOW MANY ADDITIONAL POINTS DO YOU GET FOR BEING IN ANY GIVEN CIRCLE?

Is it possible to figure out the exact influence of each circle? How many points is it worth to be in any particular circle?

Last week, I started to try to answer these questions. The ridgeline plot of the difference in points between getting and not getting each circle seemed to show that circles 4, 5, and 6 are the most important.

I attempted to fit a linear regression model that separated out the effect of each circle on a team’s total points in a match, with no success – at first.

When I recorded data from each match, I used a dummy variable encoding of each circle as a predictor variable, i.e. a team was either in (coded as 1) or out (coded as 0) of each circle. I then tried to use those values to predict the end result in points for each match, but it didn’t work.
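
As a rough illustration of that encoding, here is a minimal Python sketch (the original analysis was done in R). The function name, the nine-circle constant, and the example team are made up for illustration. None stands in for R’s NA, which the post uses for circles a team did not live to see.

```python
from typing import List, Optional

N_CIRCLES = 9  # circles per match, as discussed in the post


def encode_match(favored: List[bool], last_circle_alive: int) -> List[Optional[int]]:
    """Return one dummy per circle: 1 = in, 0 = out, None = team already eliminated."""
    row: List[Optional[int]] = []
    for circle in range(1, N_CIRCLES + 1):
        if circle > last_circle_alive:
            row.append(None)  # stands in for R's NA
        else:
            row.append(1 if favored[circle - 1] else 0)
    return row


# Hypothetical team: in circles 1, 2 and 4, eliminated during circle 5.
favored = [True, True, False, True, False, False, False, False, False]
example_row = encode_match(favored, last_circle_alive=5)
print(example_row)  # [1, 1, 0, 1, 0, None, None, None, None]
```

Each row like this, paired with the team’s points for that match, becomes one observation for the regression.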

I assumed that this was because of multicollinearity: if the same teams got the same circles again and again, each circle that I added to the analysis wouldn’t explain much additional variation in the results.

But when I tested for multicollinearity, it turned out that the variables weren’t strongly correlated with each other (VIF < 2 for all variables). Each circle actually does explain additional variation in match results that the other circles don’t already account for.
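
For anyone curious what a VIF check involves, here is a small self-contained Python sketch. It relies on a standard identity: the VIF of each predictor equals the matching diagonal entry of the inverse of the predictors’ correlation matrix. The toy columns and all function names are made up, and this is not the R code used for the post.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(len(columns))]


# Toy 0/1 predictors (made up): c1..c3 are mutually uncorrelated, c4 mostly copies c1.
c1 = [0, 0, 0, 0, 1, 1, 1, 1]
c2 = [0, 0, 1, 1, 0, 0, 1, 1]
c3 = [0, 1, 0, 1, 0, 1, 0, 1]
c4 = [0, 0, 0, 1, 1, 1, 1, 1]

independent_vifs = vifs([c1, c2, c3])  # each is 1.0 up to rounding
overlapping_vifs = vifs([c1, c4])      # each is 2.5 up to rounding
print(independent_vifs, overlapping_vifs)
```

A VIF of 1 means a predictor shares no linear information with the others. In the toy data, the near-duplicate column pushes the VIF to 2.5, above the threshold of 2 mentioned above.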

To look at this visually, I created a correlogram, or a graph of the strength of the relationship between each of the independent variables and all of the other independent variables. If the correlations are strong, close to 1 or -1, the effects of those circles are probably too similar to each other. You can see in Figure 3 that the correlations are all fairly close to 0.

Figure 3: Correlogram of potential multicollinearity in the relationships between all circles in NPL Phase 2 so far. None of these independent variables appear to be correlated.
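
The numbers behind a correlogram are just pairwise correlations. Here is a hedged Python sketch of one common way to compute them when some values are missing: use only the rows where both variables are present. The data and the function name are invented, and the post does not say which option was used in R.

```python
def pairwise_corr(x, y):
    """Pearson correlation over rows where both values are present.

    Returns the correlation and the number of rows that were usable.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / (sxx * syy) ** 0.5, n


# Made-up dummies for six team-match rows; None = team was already eliminated.
circle_1 = [1, 0, 1, 0, 1, 0]
circle_2 = [1, 0, 1, 0, 0, 1]
circle_9 = [1, 0, None, None, 0, 1]

r_12, n_12 = pairwise_corr(circle_1, circle_2)
r_19, n_19 = pairwise_corr(circle_1, circle_9)
print(f"circle 1 vs 2: r = {r_12:.2f} from {n_12} rows")
print(f"circle 1 vs 9: r = {r_19:.2f} from {n_19} rows")
```

Note how the later circle has fewer usable rows, so its correlations rest on less data.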

It was while trying to create this correlogram that I realized my problem was actually that there was a lot of missing data. I recorded circle favor in each circle for teams after they were killed as NA (not applicable), because you can’t meaningfully say that a team is favored or not favored by a circle if they’re not alive to see it.

Unfortunately, when you’re building a multiple linear regression in R, the default behavior is to drop every observation that contains any missing data, so in my case the linear model function dropped every team’s match in which that team went out before circle 9. This is called complete case analysis.

The complete case approach only left 51 observations where teams were alive until circle 9, which wasn’t enough data to draw meaningful conclusions about the effect of each circle. The usual strategy for dealing with missing data is to impute (read: guess, but fancy) what the missing data would be, but that wouldn’t be valid in this case, because there just can’t be any value of circle favor for a team that isn’t around anymore.
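
To make the cost of complete case analysis concrete, here is a tiny Python sketch with four made-up team-match rows. It compares how many rows survive listwise deletion with how many values are actually available for each circle.

```python
# Toy team-match rows (made up): nine circle dummies plus the match points.
# None marks circles the team did not live to see.
rows = [
    {"circles": [1, 1, 0, 1, 0, 1, 1, 0, 1], "points": 14},
    {"circles": [1, 0, 0, None, None, None, None, None, None], "points": 2},
    {"circles": [0, 1, 1, 1, 1, None, None, None, None], "points": 6},
    {"circles": [1, 1, 1, 0, 1, 1, 0, None, None], "points": 9},
]

# Complete case analysis keeps only rows with no missing value at all.
complete_cases = [row for row in rows if None not in row["circles"]]

# Number of rows with a recorded value, circle by circle.
available_per_circle = [
    sum(row["circles"][i] is not None for row in rows) for i in range(9)
]

print(len(complete_cases))   # 1
print(available_per_circle)  # [4, 4, 4, 3, 3, 2, 2, 1, 1]
```

Only one of the four toy rows survives, even though the early circles have data for every row.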

In order to use all the data I could, I decided to use a different strategy, called available case analysis. To sum this approach up very briefly, it estimates the relationship between each pair of variables from every observation where both values are available, instead of throwing out a whole observation because one value is missing. Norman Matloff, a CS professor at UC Davis, created an R package called regtools that can perform pairwise multiple regression. There are some potential issues with using this strategy if the missing values aren’t distributed completely at random, but Matloff and Xiao Gu argue in this article that they may not actually be a big deal.
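
Here is a minimal Python sketch of the general idea behind available case regression. It assumes the textbook version: estimate every variance and covariance from the rows where both variables are observed, then solve the usual least-squares equations with those estimates. It is not a port of regtools, and all names and numbers are made up. The toy data are built so that the true answer (intercept 2, slopes 3 and 1) is known in advance.

```python
def pairwise_cov(x, y):
    """Sample covariance using only the rows where both x and y are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    return sum((a - mx) * (b - my) for a, b in pairs) / (n - 1)


def observed_mean(x):
    """Mean of the non-missing values."""
    values = [v for v in x if v is not None]
    return sum(values) / len(values)


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def available_case_fit(predictors, y):
    """Return (intercept, slopes) from pairwise-available covariances."""
    k = len(predictors)
    cov_xx = [[pairwise_cov(predictors[i], predictors[j]) for j in range(k)]
              for i in range(k)]
    cov_xy = [pairwise_cov(p, y) for p in predictors]
    slopes = solve(cov_xx, cov_xy)
    intercept = observed_mean(y) - sum(
        s * observed_mean(p) for s, p in zip(slopes, predictors)
    )
    return intercept, slopes


# Toy data: x2 is missing in the last four rows, but those rows still inform x1.
x1 = [0, 0, 1, 1, 0, 0, 1, 1]
x2 = [0, 1, 0, 1, None, None, None, None]
y = [2, 3, 5, 6, 2, 3, 5, 6]

intercept, slopes = available_case_fit([x1, x2], y)
print(round(intercept, 6), [round(s, 6) for s in slopes])
```

A complete case fit would use only the first four rows here. The available case version uses all eight rows for anything involving x1 and y, and four rows for anything involving x2.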

RESULTS

Figure 4: The increase in total points in a match from being favored by each circle in the NPL Phase 2 so far. Non-significant circles are in gray. Colored lines represent the standard error of each estimate.

The eye test from the ridgeline plot pretty much holds up! This linear model is significant overall (F = 116.6221, df1 = 9, df2 = 374, p < 1 × 10⁻¹⁶).

Circles 1, 2, 7, and 8 have no significant effect.

If a team gets circle 3, they end up with 2.4 additional points in the match on average, regardless of where they are in other circles (p < 0.05).

If a team gets circle 4, it translates to 2.9 more points in the match (p < 0.01).

Getting circle 5 is equivalent to 2.0 more points on average (p < 0.05).

Circle 6 is equivalent to 1.9 more points (p < 0.1, so near-significant, but the effect size is quite large, so this is probably a sample size problem).

Circle 9 is worth 3.0 points on average (p < 0.14, so again short of significance; the effect size is large, but only 51 observations reach circle 9, so more data is needed). Circle 9 explains a lot of the variance in placement points that other circles don’t.
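
Because the model is additive, the per-circle estimates above can simply be summed. Here is a short Python sketch that does the arithmetic. It treats the non-significant circles (1, 2, 7, and 8) as worth zero, which is a simplifying assumption, and it includes circles 6 and 9 even though their estimates fall short of significance. The names are made up for illustration.

```python
# Estimated extra points per favored circle, taken from the results above.
CIRCLE_EFFECTS = {3: 2.4, 4: 2.9, 5: 2.0, 6: 1.9, 9: 3.0}


def extra_points(favored_circles):
    """Sum the estimated extra points for the circles a team was favored in.

    Circles without a listed effect (1, 2, 7, 8) are assumed to add nothing.
    """
    return sum(CIRCLE_EFFECTS.get(circle, 0.0) for circle in favored_circles)


best_case = extra_points([3, 4, 5, 6, 9])   # favored in every circle that matters
early_only = extra_points([1, 2])           # favored only in the early circles
print(round(best_case, 1), early_only)
```

Under these assumptions, a team favored in circles 3, 4, 5, 6, and 9 would expect about 12.2 more points in a match than a team favored in none of them.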

tl;dr NPL Royale results were all over the place. Circles 4 and 9 are worth about 3 additional points in a match. Circles 3, 5, and 6 are each worth about 2 more points (2.4, 2.0, and 1.9). The other circles have no significant impact on the outcome of a match.