library(httr)      # GET()
library(jsonlite)  # fromJSON()

# Pre-allocate 100,000 empty rows. hero1-hero5 hold the winning team,
# hero6-hero9 and hero0 hold the losing team.
replaydb <- data.frame(hero1 = character(100000),
                       hero2 = character(100000),
                       hero3 = character(100000),
                       hero4 = character(100000),
                       hero5 = character(100000),
                       hero6 = character(100000),
                       hero7 = character(100000),
                       hero8 = character(100000),
                       hero9 = character(100000),
                       hero0 = character(100000),
                       stringsAsFactors = FALSE)

i <- 1  # page number to request next

# Keep requesting pages until no empty row is left in replaydb.
while (!is.na(match("", replaydb$hero1))) {
  path <- paste0("api/v1/replays/paged?page=",
                 i,
                 "&game_type=QuickMatch&with_players=true&start_date=2017-05-01")
  rawresult <- GET(url = "http://hotsapi.net", path = path)
  content <- fromJSON(rawToChar(rawresult$content))
  for (j in 1:nrow(content$replays)) {
    # Only keep games in which all 10 players have hero level 5 or higher.
    if (sum(content$replays$players[[j]]$hero_level > 4) == 10) {
      k <- match("", replaydb$hero1)  # first row that is still empty
      if (is.na(k)) break             # replaydb is full
      filter <- content$replays$players[[j]]$winner == "TRUE"
      win_team <- content$replays$players[[j]]$hero$name[filter]
      lose_team <- content$replays$players[[j]]$hero$name[!filter]
      vec <- c(win_team, lose_team)   # winners first, then losers
      replaydb[k, ] <- vec
    }
  }
  i <- i + 1
  Sys.sleep(0.2)  # short pause between requests
}
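The loop above assumes that every page contains replays, and it stops with an error when a page comes back empty. One way to make that case harmless is to move the per-page work into a helper that returns zero rows for an empty page. This is only a sketch: `extract_games` and `min_level` are hypothetical names, not part of httr or jsonlite, and it assumes the parsed page has the same shape the loop relies on (`replays$players` is a list of per-game data frames with `hero_level`, `winner` and a nested `hero$name`).

```r
# Hypothetical helper: turn one parsed page into a character matrix with one
# row per qualifying game (winners in columns 1-5, losers in columns 6-10).
extract_games <- function(content, min_level = 5) {
  players <- content$replays$players
  empty <- matrix(character(0), ncol = 10)
  # An empty or missing page yields zero rows instead of an error.
  if (is.null(players) || length(players) == 0) return(empty)

  rows <- lapply(players, function(p) {
    # Keep only 10-player games where everyone meets the level floor.
    ok <- nrow(p) == 10 && isTRUE(all(p$hero_level >= min_level))
    if (!ok) return(NULL)
    won <- as.logical(p$winner)
    if (sum(won) != 5) return(NULL)
    c(p$hero$name[won], p$hero$name[!won])
  })
  rows <- Filter(Negate(is.null), rows)
  if (length(rows) == 0) return(empty)
  do.call(rbind, rows)
}
```

Inside the while loop, `games <- extract_games(content)` could then replace the inner for loop, with the rows of `games` copied into the next free rows of `replaydb`.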

content$replays

i

replaydb

Hero               Frontline
Abathur            2
Alarak             4
Alexstrasza        0
Ana                0
Anub'arak          9
Artanis            8
Arthas             9
Auriel             0
Azmodan            1
Blaze              9
Brightwing         0
Cassia             1
Chen               7
Cho                9
Chromie            0
D.Va               9
Deckard            3
Dehaka             9
Diablo             9
E.T.C.             9
Falstad            0
Fenix              1
Gall               0
Garrosh            9
Gazlowe            3
Genji              0
Greymane           2
Gul'dan            0
Hanzo              0
Illidan            4
Jaina              0
Johanna            9
Junkrat            0
Kael'thas          0
Kel'Thuzad         0
Kerrigan           5
Kharazim           2
Leoric             8
Li Li              3
Li-Ming            0
Lt. Morales        0
Lúcio              0
Lunara             0
Maiev              4
Malfurion          0
Malthael           4
Medivh             1
Muradin            9
Murky              2
Nazeebo            1
Nova               0
Probius            0
Ragnaros           5
Raynor             0
Rehgar             3
Rexxar             7
Samuro             3
Sgt. Hammer        3
Sonya              7
Stitches           8
Stukov             0
Sylvanas           0
Tassadar           0
The Butcher        5
The Lost Vikings   0
Thrall             5
Tracer             0
Tychus             1
Tyrael             9
Tyrande            0
Uther              4
Valeera            4
Valla              0
Varian             9
Xul                3
Yrel               9
Zagara             0
Zarya              7
Zeratul            0
Zul'jin            1

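# Column 1 of herodata.csv is the hero name, column 2 its frontline value (0-9).
herodata <- read.csv('herodata.csv', header = FALSE)

# Total frontline value per team: columns 1-5 are the winners, 6-10 the losers.
frontline1 <- numeric(nrow(replaydb))
frontline2 <- numeric(nrow(replaydb))
for (i in 1:nrow(replaydb)) {
  sub <- herodata[match(unlist(replaydb[i, 1:5]), herodata[, 1]), 2]
  sub2 <- herodata[match(unlist(replaydb[i, 6:10]), herodata[, 1]), 2]
  frontline1[i] <- sum(sub)
  frontline2[i] <- sum(sub2)
}

replaydb$frontline1 <- frontline1
replaydb$frontline2 <- frontline2
replaydb$frontlinediff <- frontline1 - frontline2  # winners minus losers

# Number of games at each frontline difference.
frontlinedata <- as.data.frame(table(replaydb$frontlinediff))

frontlinewrs <- data.frame(frontlinediff = numeric(35), wr = numeric(35))
j <- 1
for (i in 1:30) {
  # A difference of +i means the team with the bigger frontline won,
  # a difference of -i means it lost.
  wins <- frontlinedata$Freq[match(i, frontlinedata$Var1)]
  losses <- frontlinedata$Freq[match(-i, frontlinedata$Var1)]
  # Only keep differences seen in more than 50 games.
  if (!is.na(wins) && !is.na(losses) && (wins + losses) > 50) {
    frontlinewrs[j, 1] <- i
    frontlinewrs[j, 2] <- wins / (wins + losses)
    j <- j + 1
  }
}

frontlinewrs[seq_len(j - 1), ]  # show only the rows that were filled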

Here was the result:

   frontlinediff        wr
1              1 0.5044570
2              2 0.5163814
3              3 0.5089774
4              4 0.5169750
5              5 0.5207921
6              6 0.5275182
7              7 0.5259750
8              8 0.5164336
9              9 0.5375276
10            10 0.5249141
11            11 0.5490196
12            12 0.5308034
13            13 0.5595476
14            14 0.5371981
15            15 0.5429403
16            16 0.5882353
17            17 0.6186186
18            18 0.5418502
19            19 0.5563380
20            20 0.5887850
21            21 0.5625000
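The same kind of table can be computed without explicit loops. The sketch below is an alternative, not the code that produced the numbers above. `frontline_winrates`, `games` and `scores` are hypothetical names: `games` stands for the ten hero columns (winners first) and `scores` for a named vector of frontline values, which could be built with `setNames(herodata[, 2], as.character(herodata[, 1]))`.

```r
# Hypothetical helper: win rate of the team with the larger frontline total,
# grouped by the size of its advantage.
# games:  10 character columns, winners in 1-5 and losers in 6-10
# scores: named numeric vector, names are hero names
frontline_winrates <- function(games, scores, min_games = 50) {
  team_total <- function(cols) {
    m <- as.matrix(games[, cols])
    # Look up every hero, then sum per game (row).
    rowSums(matrix(scores[as.vector(m)], nrow = nrow(m)))
  }
  diff <- team_total(1:5) - team_total(6:10)  # winners minus losers
  diff <- diff[!is.na(diff) & diff != 0]      # drop unknown heroes and ties
  wins   <- table(diff[diff > 0])             # advantaged team won
  losses <- table(-diff[diff < 0])            # advantaged team lost
  # Like the loop, only keep advantages seen on both the winning and losing side.
  sizes <- sort(as.numeric(intersect(names(wins), names(losses))))
  w <- as.numeric(wins[as.character(sizes)])
  l <- as.numeric(losses[as.character(sizes)])
  keep <- (w + l) > min_games
  data.frame(frontlinediff = sizes[keep], wr = w[keep] / (w[keep] + l[keep]))
}
```

Unlike the loop, this version is not capped at a difference of 30; the `min_games` filter does the same job as the 50-game check.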

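# Heroes counted as mages. The spelling has to match the hero names in
# replaydb exactly; the frontline list spells it "Kel'Thuzad".
mages <- c("Li-Ming", "Chromie", "Kael'thas", "Kel'Thuzad", "Jaina")
magecount <- function(x) sum(x %in% mages)

# Mages per team: columns 1-5 are the winners, 6-10 the losers.
mage1 <- apply(replaydb[, 1:5], 1, FUN = magecount)
mage2 <- apply(replaydb[, 6:10], 1, FUN = magecount)
replaydb$mage1 <- mage1
replaydb$mage2 <- mage2

# One row per (mage1, mage2) combination with the number of games.
generalmagedata <- data.frame(mage1 = numeric(36), mage2 = numeric(36), gamecount = numeric(36))

for (i in 0:5) {
  for (j in 0:5) {
    filter <- replaydb$mage1 == i & replaydb$mage2 == j
    generalmagedata[i * 6 + j + 1, ] <- c(i, j, sum(filter))
  }
}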

generalmagedata

   mage1 mage2 gamecount
1      0     0     39221
2      0     1     22630
3      0     2      2797
4      0     3        86
5      0     4         2
6      0     5         0
7      1     0     19637
8      1     1     11130
9      1     2      1386
10     1     3        46
11     1     4         1
12     1     5         0
13     2     0      1837
14     2     1      1026
15     2     2       126
16     2     3         8
17     2     4         0
18     2     5         0
19     3     0        38
20     3     1        22
21     3     2         5
22     3     3         0
23     3     4         0
24     3     5         0
25     4     0         1
26     4     1         1
27     4     2         0
28     4     3         0
29     4     4         0
30     4     5         0
31     5     0         0
32     5     1         0
33     5     2         0
34     5     3         0
35     5     4         0
36     5     5         0

0 vs 1: 53.5%

0 vs 2: 60.4%

0 vs 3: 69.4%

1 vs 2: 57.5%

1 vs 3: 67.6%
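These percentages are the win rate of the team with fewer mages. For 0 vs 1, that is 22630 / (22630 + 19637), or about 53.5%, using the game counts listed earlier. The number crunching itself is not shown in the post, so the following is only a sketch of how it could be done; `mage_winrates`, `win_mages` and `lose_mages` are hypothetical names standing in for the per-game mage counts of the winning and losing teams (`replaydb$mage1` and `replaydb$mage2`).

```r
# Hypothetical helper: win rate of the team with fewer mages for every
# matchup "fewer vs more" that has more than `min_games` games.
mage_winrates <- function(win_mages, lose_mages, min_games = 50) {
  # 6 x 6 table of game counts: rows = winners' mages, columns = losers' mages.
  counts <- table(factor(win_mages, levels = 0:5),
                  factor(lose_mages, levels = 0:5))
  pairs <- subset(expand.grid(fewer = 0:5, more = 0:5), fewer < more)
  # The team with fewer mages won / lost this many games in each matchup.
  wins   <- counts[cbind(pairs$fewer + 1, pairs$more + 1)]
  losses <- counts[cbind(pairs$more + 1, pairs$fewer + 1)]
  pairs$games <- wins + losses
  pairs$wr <- wins / pairs$games
  pairs <- pairs[pairs$games > min_games, ]
  pairs[order(pairs$fewer, pairs$more), ]
}
```

Calling `mage_winrates(replaydb$mage1, replaydb$mage2)` would return one row per matchup that clears the 50-game cut-off.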

replaydb

As a budding data scientist (read: I downloaded RStudio and learned some syntax), I decided to play around with the data from HotSAPI and see if there were any interesting results that I could pull out.

One of the hot topics over the past few months has been team compositions in quickmatch. No doubt the current situation is a disaster: as a reasonably frequent QM player myself, I frequently feel that some games require extraordinary skill differences to overcome the difficulties presented by the compositions.

One frequent situation is that one team is just far lower in total health pools than the other, or one team is filled with vulnerable mages that need a frontline to give them the space to deal damage, while the other team has self-sustaining melee assassins with damage mitigation. The latter team frequently finds itself more able to gain ground in teamfights over objectives because mages simply don't have the staying power in a prolonged fight, even if they get some good spells in.

I decided to investigate whether these feelings held ground in actual match statistics. I harvested 100,000 replays from HotSAPI of quickmatches after the 1st of May, 2017 (no particular reason for the date, but for some reason I can't get any results when searching after December 2017; have absolutely no QM replays been uploaded past that date?), with all players having hero level at least 5, using the harvesting code at the top of this post.

The code didn't run smoothly; the HotSAPI server had some hiccups over the hours that it took to harvest this data and I would get an error because content$replays would be empty. When that happened, I incremented i by 1 and re-ran the while loop to avoid any repetitions. Of course, unfortunately this means that my results aren't fully reproducible, but I hope that anyone trying this out themselves would get within 1% of the collection of replays I got.

This code produces replaydb, a dataframe where each row contains 10 hero names: the first 5 are the heroes on the winning team and the last 5 are the heroes on the losing team.

The next step was to assign "frontlineness" values to each hero, to give an idea of how much staying power they afford their team. Obviously this isn't an exact science and people will disagree with my exact values, but I used a scale of 0 to 9. The guiding principles were that all full tanks get a 9, melee assassins with healthy amounts of self-sustain are anywhere from 4 to 7, and healers are 0 to 2 depending on my personal experience with the heroes. I adjusted up and down depending on how easily they are harassed out, how much pressure and threat (i.e. damage) they apply, and their vulnerability to burst. The values I arrived at are the per-hero list above.

I worked this all out in an Excel file and saved it as a csv to be read into R as an object, then calculated the "total frontlines" of each team in each game and appended them to the dataframe. The last step was the number crunching on the winrates of the games depending on the difference in frontline, which gives the frontlinediff / wr table above.

It is a subtle but definite trend: a larger difference in frontline ability between quickmatch teams is associated with a higher winrate. The trend isn't strictly increasing, and it's more than likely that my exact assessments of frontline ability aren't fully accurate, but it's better than no trend at all. It's also notable that at no point is there a sub-50% winrate; no positive difference in frontline ability results in a lower-than-average winrate.

I think that there is a case to be made here to stop lumping Butcher, Thrall and Kerrigan in the same "dps hero" pool as Li-Ming, Valla or Chromie. I've played my fair share of melee assassins, bruisers and mages, and there's a very definite feeling of who's in charge of the teamfight when the quickmatch teamcomps become so extreme that the difference in "frontlineness" is 10 or more.

As a small follow-up to this, I decided to count the number of mages in each team instead. Mages tend to get pushed around a lot in teamfights. They are cooldown-dependent, so there are significant periods of the game where they are extremely helpless and pose no threat because their abilities are on cooldown. They have no staying power over control-point objectives and depend on a tank to be able to kill people outside of a burst flank (which requires personal skill and an unaware enemy team).

The mage-counting code above analyses the mage situation. generalmagedata is the object I got, and after number crunching we get the winrates listed under it, each one being the winrate of the team with fewer mages. I used 50 as a cut-off point for the number of games to consider significant and ignored all other matchups.

This time, the differences in winrates are much more pronounced. Having 3 mages on a team going up against 0 mages is a significant disadvantage that would have been mitigated if one mage was simply swapped over to the other team; a similar argument could be made for 0 vs 2. Skill differences and MMR differences are obviously other criteria to consider, which is why these mage-imbalanced matchups occasionally occur, but I think that the numbers here are significant enough to consider a mage-balancing matchmaking rule for quickmatch: never let the number of mages differ by 2 or more.

(On a side note: if someone would like to play around with this 100,000 quickmatch team composition dataset, I have replaydb saved onto an RDS file, available here: https://www.dropbox.com/s/yltchd599syl84o/replaydb.rds?dl=0)