As part of my migration to Dotaholic, I am moving over some older work in which I looked at different aspects of Dota2 from a statistical point of view. Here is some work I did examining the efficiency of different hero combinations. Since this work was done using data taken in May, some of the newer heroes are missing!

Pair-wise Combinations

I'll begin by describing some of the analysis that I carried out to examine which hero combinations do well together, and which heroes do poorly against each other.

This analysis is based on a subset of around 1,300,000 games. It will soon be updated to include all the games in the database, and we'll repost all the results on Dotaholic.com. As soon as it's posted, it will be linked here.

For all heroes currently (as of 06/05/2012) in Dota2, we calculated their average win rate in combination with all other heroes, and their average win rate versus all other heroes. Given the high number of heroes in the game, the number of possible pairs and matchups is very high: with n heroes there are $$ {n \choose 2} $$ distinct pairs.

While certain hero pairs do particularly well, for example Lich together with Ursa, this is hardly surprising: both Lich and Ursa have a very high individual win rate, so it is expected that they win a lot of games together.

I therefore calculated an expected win rate, given the individual win rates of both heroes, as follows:

\[ExpectedWinRate(a, b) = \frac{\sum\limits_{i \neq b} W_{a,i} + \sum\limits_{j \neq a} W_{b,j}}{\sum\limits_{i \neq b} G_{a,i} + \sum\limits_{j \neq a} G_{b,j}}\]

Where $$W_{a,i}$$ is the number of wins of hero a with hero i, and $$G_{a,i}$$ is the total number of games of hero a with hero i.
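To make the expected win rate formula above concrete, here is a minimal Python sketch. The nested-dictionary layout, the function name `expected_pair_win_rate`, and the hero labels and counts are all assumptions made up for illustration; they are not taken from the actual data set or from the code used for this analysis.

```python
def expected_pair_win_rate(wins, games, a, b):
    """Pooled win rate of a (without b) and b (without a), weighted by games.

    wins[x][y] and games[x][y] hold the wins and games of hero x
    when teamed up with hero y (hypothetical data layout).
    """
    # Wins of a with every partner except b, plus wins of b with every partner except a.
    total_wins = (sum(w for i, w in wins[a].items() if i != b)
                  + sum(w for j, w in wins[b].items() if j != a))
    # Same sums over the number of games played.
    total_games = (sum(g for i, g in games[a].items() if i != b)
                   + sum(g for j, g in games[b].items() if j != a))
    return total_wins / total_games


# Made-up counts for three placeholder heroes.
wins = {
    "hero_a": {"hero_b": 40, "hero_c": 30},
    "hero_b": {"hero_a": 40, "hero_c": 60},
    "hero_c": {"hero_a": 30, "hero_b": 60},
}
games = {
    "hero_a": {"hero_b": 80, "hero_c": 50},
    "hero_b": {"hero_a": 80, "hero_c": 150},
    "hero_c": {"hero_a": 50, "hero_b": 150},
}

expected = expected_pair_win_rate(wins, games, "hero_a", "hero_b")
print(f"Expected win rate for the pair: {expected:.3f}")
```

With these made-up counts, hero_a wins 30 of 50 games without hero_b (60%) and hero_b wins 60 of 150 games without hero_a (40%). The pooled value is 90 / 200 = 45%, which sits closer to hero_b's rate because hero_b contributes more games.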
In practice, the expected win rate is the average of the win rate of hero a with all heroes except hero b and the win rate of hero b with all heroes except hero a, weighted by the total number of games.

It's important to note that this expected value is quite conservative. Because it is a weighted average of two win rates, it always falls between them: according to this measure, no hero pair will ever do better than the best individual hero, and no hero pair will ever do worse than the worst individual hero. This measure is nevertheless quite strongly correlated with actual performance.

We then compared the ExpectedWinRate for a given hero pair with the observed win rate in our data set, using a Chi-squared test. Put plainly, this is a statistical test used to assess whether the observed frequency in a sample deviates significantly from the expected frequency. For an intuitive understanding, imagine that you are trying to determine whether a coin is rigged or not. If you expect a non-rigged coin to land heads 50% of the time, how sure are you that a coin is rigged if you observe 430 heads out of 1000 tosses? (Quite sure, it turns out.)

We can observe that certain heroes perform significantly better than we'd expect. For example, Skeleton King has a relatively average performance, but when paired with Ursa, his win rate approaches 70%. This is rather intuitive: in pub games, Skeleton King and Ursa can take Roshan at level 1, giving your team a massive advantage. For each hero, we can observe their win rate, their best team-mates, and their worst team-mates.

Next, we examined the matchups of one hero versus another.
To do this, we used a different formula to calculate the expected win rate in a head-to-head matchup:

\[ExpectedWinRate(a, b) = \frac{\sum\limits_{i \neq b} W_{a,i} + \sum\limits_{j \neq a} L_{b,j}}{\sum\limits_{i \neq b} G_{a,i} + \sum\limits_{j \neq a} G_{b,j}}\]

In this case, we are averaging the win rate of hero a versus all other heroes with the loss rate of hero b versus all other heroes. Here the counts refer to games played against, rather than with, the other hero, and $$L_{b,j}$$ is the number of losses of hero b against hero j.

Again, this quantity is highly correlated with the actual observed performance, although the predicted performance again has a narrower range than the observed performance.

We will continue to look at more and more stats. Expect to see:
– The cumulative win rate of hero pairs over time. Which hero pairs are favored in the end game?
– Which heroes benefit the most from being on the Dire/Radiant?
– An initial look at a small number of hero triplets.
– Bayesian approaches to predict win rate purely as a function of heroes. For the more mathematically inclined: I checked whether, given a win/loss, the presence of different heroes in a team was conditionally independent, that is, whether $$P(a \mid W, b) = P(a \mid W)$$. Unfortunately, this didn't turn out to be the case at all.

From this initial look at the data, we can conclude that:
– The win rate of two given heroes is strongly correlated with the individual win rates of those heroes. Heroes that tend to do poorly rarely have a synergy with another hero strong enough to drastically change their win rate.
– The strongest performing hero pair is Ursa and Skeleton King. This is most likely due to the ability to do Roshan at level 1, and to the fact that Ursa is by far the single hero with the highest individual performance.
– Alchemist does very poorly with more or less every single hero.
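For readers who want to try the Chi-squared comparison described earlier, here is a minimal sketch in Python using only the standard library. It is applied to the coin example from the text; the function name `chi_squared_test` is an assumption for illustration, and the original analysis may well have used a different tool or a different variant of the test.

```python
import math


def chi_squared_test(successes, trials, expected_rate):
    """Chi-squared goodness-of-fit test for a two-outcome count (1 degree of freedom).

    Returns the test statistic and the p-value, i.e. the probability of a
    deviation at least this large if expected_rate were the true rate.
    """
    expected_successes = trials * expected_rate
    expected_failures = trials - expected_successes
    failures = trials - successes
    statistic = ((successes - expected_successes) ** 2 / expected_successes
                 + (failures - expected_failures) ** 2 / expected_failures)
    # With 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# The coin example: 430 heads in 1000 tosses, expecting 50% heads.
statistic, p_value = chi_squared_test(430, 1000, 0.5)
print(f"chi-squared = {statistic:.1f}, p-value = {p_value:.2e}")
```

For the coin, the statistic works out to 19.6 and the p-value is far below 0.001, which is why one can be quite sure the coin is rigged. The same function could be called with a pair's observed wins, its number of games, and its ExpectedWinRate in place of the coin numbers.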