Team performance

We surmise that player performance can be extended to the team level by calculating the average performance of a subset of the players,

\[ P_T = \frac{1}{m} \sum_{i=1}^{m} p_i \, , \tag{1} \]

where $p_i$ denotes the performance of the $i$-th best-performing player on the team and $m$ is the number of top players included in the average. We further assume that the performance difference between the two teams in a match, which we define as

\[ \delta P = P_{T_1} - P_{T_2} \, , \tag{2} \]

will provide an indicator of which team "deserved" victory in a match (Fig. 2A). To test these hypotheses, we first obtain the distribution of differences in performance conditional on outcome,

\[ p(\delta P \mid \text{outcome}) \, , \tag{3} \]

where outcome $\in$ {"Win", "Loss", "Not Win"}. Figure 2 shows the cumulative distributions of $\delta P$ for these three outcomes (see Fig. 3 for a justification for this choice). It is visually apparent that the mean $\delta P$ is substantially larger for the cases in which the team with the highest performance wins the match.
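As a concrete illustration, the definitions in Eqs. (1) and (2) amount to the following computation. This is a minimal sketch: the per-player performance values are hypothetical stand-ins for the normalized log flow centralities, and we take the top $m = 2$ players, the choice justified later in the text.

```python
def team_performance(player_performances, m=2):
    """Mean performance of the top-m players on a team (Eq. 1)."""
    top = sorted(player_performances, reverse=True)[:m]
    return sum(top) / len(top)

def performance_difference(team_a, team_b, m=2):
    """Difference in team performance between two teams (Eq. 2)."""
    return team_performance(team_a, m) - team_performance(team_b, m)

# Hypothetical per-player performance values for two teams:
team_a = [0.9, 0.8, 0.4, 0.3]
team_b = [0.7, 0.5, 0.5, 0.2]
print(round(performance_difference(team_a, team_b), 2))  # 0.85 - 0.60 = 0.25
```

A positive difference indicates that team A's top players outperformed team B's; the hypothesis under test is that this sign and magnitude track the match outcome.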

Figure 2. Validity of the flow centrality metric. We define team performance as the mean normalized log flow centrality of the top players in a team. (A) Cumulative distribution of $\delta P$ for matches where the team with the highest performance wins, loses, or "not wins". Clearly, the mean is much larger for games in which the team with the highest performance wins. We use Monte Carlo methods with bootstrapping to determine the significance of the differences in means for the different match outcomes. The red lines indicate the observed difference in means, whereas the blue curves are the distributions of differences measured under the null hypothesis. (B) We find no statistically significant difference when comparing "Loss" versus "Not Win" outcomes. In contrast, we find highly significant differences when comparing (C) "Win" versus "Loss" or (D) "Win" versus "Not Win". https://doi.org/10.1371/journal.pone.0010937.g002

Figure 3. Sensitivity and specificity of the flow centrality metric. (A) For every distinct value of $\delta P$ in our data, we calculate the fraction of values of $\delta P$ in the groups "Win" and "Not Win". The area under the curve (AUC) statistic provides a measure of the sensitivity-specificity of the quantity under consideration [12]. Values of AUC close to 1 indicate high sensitivity with high specificity. We find an AUC of 0.825, much larger than the values expected by chance at the 90% confidence interval (shown in gray), which vary between 0.319 and 0.652. (B) Number of matches in which the team with the highest performance wins, ties, or loses as a function of $\delta P$. For the 20 matches where the difference is greater than 0.75, the team with the highest performance won 15 times, tied 2, and lost 3. This means that for $\delta P > 0.75$ the odds of the team with the highest performance winning the match are 3∶1. (C) AUC statistic as a function of the number of top players $m$ in Eq. (1), for "Win" versus "Loss" outcomes. The highest AUC value is achieved for $m = 2$. https://doi.org/10.1371/journal.pone.0010937.g003

We define $\Delta$ as the difference between the mean values of $\delta P$ for two outcome groups,

\[ \Delta = \langle \delta P \rangle_{o_1} - \langle \delta P \rangle_{o_2} \, . \tag{4} \]

To test the significance of the values of $\Delta$ obtained, we use bootstrap hypothesis testing [12]. Specifically, we pool the values of $\delta P$ from all 30 matches in the tournament. We then draw surrogate random samples with replacement from the pooled data. For instance, for the case in Fig. 2B we draw surrogate "Loss" and "Not Win" samples with 9 and 14 data points, respectively, and then determine the difference in means of the two surrogate samples. We repeat this procedure 50,000 times in order to determine the significance of the observed $\Delta$. As shown in Figs. 2B, C, and D, we find no significant difference in mean $\delta P$ between the "Loss" and "Not Win" outcomes, while the values of $\Delta$ for "Win" versus "Loss" and for "Win" versus "Not Win" are highly significant.
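The bootstrap procedure described above can be sketched as follows. This is illustrative only: the $\delta P$ values are invented rather than taken from the tournament data, and the function name is our own.

```python
import random

def bootstrap_p_value(sample1, sample2, n_boot=50_000, seed=42):
    """Bootstrap test for a difference in means: pool the two samples,
    repeatedly draw surrogate samples of the same sizes with replacement,
    and count how often the surrogate difference in means is at least as
    extreme as the observed one (two-sided)."""
    rng = random.Random(seed)
    pooled = list(sample1) + list(sample2)
    observed = sum(sample1) / len(sample1) - sum(sample2) / len(sample2)
    extreme = 0
    for _ in range(n_boot):
        s1 = [rng.choice(pooled) for _ in sample1]
        s2 = [rng.choice(pooled) for _ in sample2]
        diff = sum(s1) / len(s1) - sum(s2) / len(s2)
        if abs(diff) >= abs(observed):
            extreme += 1
    return extreme / n_boot

# Hypothetical delta-P values for 9 "Loss" and 14 "Not Win" matches:
loss = [-0.6, -0.4, -0.5, -0.2, -0.3, -0.7, -0.1, -0.4, -0.5]
not_win = [-0.5, -0.3, -0.4, -0.2, -0.6, -0.1, -0.3, -0.5,
           -0.2, -0.4, -0.6, -0.3, -0.1, -0.4]
p = bootstrap_p_value(loss, not_win)  # large p: no significant difference
```

Because resampling is done from the pooled data, the surrogate differences are distributed as they would be under the null hypothesis that the two outcome groups share a common distribution.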

The fact that $\delta P$ differs significantly between matches in which the better-performing team wins and those in which it does not suggests that $\delta P$ is correlated with the outcome of a match and can thus be used as an objective measure of performance. We therefore use the area under the curve (AUC) statistic for the receiver operating characteristic (ROC) curve, sometimes also called the sensitivity-specificity curve, in order to quantify the sensitivity and specificity of $\delta P$. Figure 3A shows the ROC curve for the outcomes "Win" versus "Not Win." We obtain an AUC of 0.825, which lies far outside the 90% confidence band for random samples [0.319, 0.653]. We find that the best AUC value is obtained when team performance is defined as the average performance of the top two players in a team, although averaging the top 1 to 4 players would also lead to significant discrimination (Fig. 3C).
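The AUC can be computed directly as the probability that a randomly chosen value from the "Win" group exceeds a randomly chosen value from the "Not Win" group, which is the Mann-Whitney formulation of the statistic. A sketch with hypothetical $\delta P$ values:

```python
def auc(positives, negatives):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the fraction of (positive, negative) pairs in which the positive
    value is larger (ties count one half)."""
    favorable = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                favorable += 1.0
            elif p == n:
                favorable += 0.5
    return favorable / (len(positives) * len(negatives))

# Hypothetical delta-P values for the "Win" and "Not Win" groups:
win = [0.9, 0.7, 0.4]
not_win = [0.5, 0.2, 0.1]
print(auc(win, not_win))  # 8 of 9 pairs favor "Win": ~0.889
```

An AUC of 0.5 corresponds to chance discrimination, 1.0 to perfect separation of the two groups; the paper's observed value of 0.825 sits well toward the latter.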

The AUC analysis enables us to conclude that, when $\delta P > 0.75$, the odds that the team with the higher performance wins the match are 3∶1 (Fig. 3B). Our team performance metric supports the general consensus that Spain, the winner of Euro 2008, played extremely well throughout the tournament (Table 1 and Fig. 4).
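The 3∶1 odds follow directly from the match counts reported for Figure 3B, grouping ties with losses as non-wins for the higher-performance team:

```python
# Counts for the 20 matches with a large performance difference (Fig. 3B):
wins, ties, losses = 15, 2, 3
odds = wins / (ties + losses)   # 15 favorable vs 5 unfavorable outcomes
print(f"{odds:.0f}:1")          # prints 3:1
```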