What I have attempted to describe so far are termed ‘vanilla’ neural nets [I call them Normie Nets because a) it sounds funky and b) I have a cringey sense of humour. Shoutout to Mr Action Slacks]. They take a state in, they predict a state out. So what’s wrong with Normies?

Normie nets are like goldfish. They have no memory. You can use a Normie net to predict the last pick of a draft. However, if you then want to predict the first pick… you need to make a whole new Normie net and train it from scratch!

If you want a network that can accurately predict all phases of picks and bans, this set-up is no good. How can it learn whether Earthshaker is usually a 1st pick or a 5th pick, when all it ever sees is “Earthshaker was picked/Earthshaker wasn’t picked”?

To predict all phases we need to spice up our network. Somehow we need to let it know what order heroes are picked/banned in. People a lot cleverer than me already figured out a solid way to do this: the Recurrent Neural Network (RNN). RNNs are neural nets you feed a whole sequence of inputs, rather than one input at a time.

Wait, what is a Dota draft again?

Yep, it’s simply a sequence of picks and bans! This sounds like precisely what we need.

Going on a tangent about RNNs

Recurrent neural nets are sick! You can create AI that closely imitates Shakespeare [some dude who did a lot of plays [like films but shit] in olden times], programming languages, scientific papers, music… etc. I originally learnt about them from the amazing examples in http://karpathy.github.io/2015/05/21/rnn-effectiveness/

After reading this I realised that a Dota draft is very similar to text generation. Dota drafting is like a language where every word is exactly 20 letters long [the picks and bans], written with an alphabet of 113 possible letters [the heroes].
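To make that concrete, here’s a rough sketch of the encoding. The hero IDs below are made up for illustration; I’m just showing the idea, not my actual preprocessing code:

```python
import numpy as np

# A draft is just an ordered list of 20 hero IDs (1-113).
# These IDs are invented for the example.
draft = [74, 12, 105, 33, 7, 91, 56, 22, 68, 3,
         110, 45, 19, 88, 61, 27, 99, 14, 70, 52]

NUM_HEROES = 113

# One-hot encode: each "letter" becomes a vector of 113 zeros
# with a single 1 at that hero's index.
one_hot = np.zeros((len(draft), NUM_HEROES))
for step, hero_id in enumerate(draft):
    one_hot[step, hero_id - 1] = 1

print(one_hot.shape)  # (20, 113) -- a 20-letter word, 113-letter alphabet
```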

This blog post https://chunml.github.io/ChunML.github.io/project/Creating-Text-Generator-Using-Recurrent-Neural-Network/ was immensely helpful for implementing recurrent neural nets in python/keras.

Choosing a model architecture

The model was given very few constraints. For example, I did not tell it that it should never try to pick a hero that had already been picked or banned. I wanted to see if the model could learn the structure of Dota 2 drafting by itself.

Initially this seems fairly trivial, but that is only because as humans we see a drafting screen with helpful indicators, and we have experience playing the game ourselves and being blocked from picking a banned hero.

All the neural network sees are sequences of 20 numbers, from 1–113. If you were just given long, cryptic lists of numbers from such a wide range… would you be able to spot that no number can appear twice in a sequence?

Eventually, yes, but it’d take you a while. This may just be my opinion, but when it comes to machine learning, humans still learn way faster than machines on the same amount of data (we have 100 billion neurons in our brains; my recurrent neural net has several hundred. Humans are slightly imba tbh).

Where machines get the upper hand is that they can process thousands or millions of examples in a second, something that would take a human months or years. So eventually the net could learn to never try to pick a banned hero, but that relies on two conditions:

1) We have a sufficiently complex network to learn it

2) We have enough matches as examples to work this out

For 1), what do we mean by a complex network?

Rather than the simple Input -> Neurons -> Outputs of Fig. 2, it’s possible to have multiple layers of neurons:

Input -> Neurons -> Neurons -> Neurons -> Outputs

The input data is piped into one layer of neurons, which do their calculations before sending their results on to the next layer of neurons. We can pass through as many layers as we like before finally making a prediction in the output layer.

Having multiple layers is where the ‘deep’ in deep learning comes from. The more layers our network has, the deeper it is.

Therefore, to imitate the complex behaviour and huge set of possibilities in Dota drafting, I used 3 separate layers of neurons.
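For the curious, here’s roughly what a three-layer recurrent model looks like in Keras. The layer type (LSTM) and hidden size here are placeholders for illustration, not my exact settings:

```python
from keras.models import Sequential
from keras.layers import LSTM, TimeDistributed, Dense, Activation

NUM_HEROES = 113  # size of our "alphabet"
HIDDEN = 256      # neurons per layer -- a placeholder value

model = Sequential()
# Three stacked recurrent layers; each passes its full sequence
# of outputs on to the next layer.
model.add(LSTM(HIDDEN, input_shape=(None, NUM_HEROES), return_sequences=True))
model.add(LSTM(HIDDEN, return_sequences=True))
model.add(LSTM(HIDDEN, return_sequences=True))
# At every step of the draft, output a probability for each hero.
model.add(TimeDistributed(Dense(NUM_HEROES)))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='rmsprop',
              metrics=['accuracy'])
```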

I read that it’s a good idea to put plenty of images in blogs to break up the text, rendering the material more easily consumable by the viewer. Please continue to consume this blog post after having glanced at this picture and read this caption.

Issue 2) is what led to my decision to include the ‘Amateur’ recorded DotA 2 team games, bringing the match count total to nearly 90,000.

How many neurons to put in each layer?

There’s not really a right or wrong answer here. Most initial machine learning architecture design seems to come down to gut feeling or experience.

It seems there is no single way to determine the correct network architecture; you just try as many different ones as possible. Whichever model comes out with the highest validation accuracy is the model you go for.

[This also applies to how many layers of neurons to use. Try different depths until you find a good compromise between accuracy vs. (training speed + overfitting).]

Overfitting: your model is too complex compared to the size of your training data. The result is a model that just blindly copies drafts it has seen before, pick for pick, rather than properly learning concepts that can be used to make sensible picks.

What’s Validation accuracy?

Machine learning has 3 different types of accuracy:

Training accuracy: How accurate the model is on the data used to train it. If you think about it, after enough training this value always becomes high. Imagine seeing an OG vs EG draft, using a time machine to go back 5 minutes, and then predicting the draft you’ve already seen. Nobody is impressed by this [the prediction part; people would be impressed by the time machine, I think]. If someone tells you they have 99% training accuracy and expects you to be impressed, you can tell them to jog on.

Validation accuracy: Part of the data we gather is held back and not used for training. The model doesn’t learn on the validation data; it just tests itself against it after training. This gives us an idea of the true accuracy of the model when it makes predictions on new matches.

Test accuracy: Similar to validation accuracy, in that the model doesn’t learn on this data either. However, the validation data came from the same time-frame of matches as the training data, and the DotA meta constantly shifts over time. If a new patch hits, our model will be outdated and our validation accuracy will sit far above the true accuracy. Test accuracy is that true accuracy: we can use future matches as our test data to calculate how well the model really performs.
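In Keras terms, holding back a validation set is a single argument to fit. A sketch, assuming X and y are the one-hot draft arrays from earlier:

```python
# X: drafts as one-hot arrays, shape (num_matches, 20, 113).
# y: the same drafts shifted one step, so at every position the
#    target is "the next pick/ban given the draft so far".
# validation_split holds back the last 10% of matches: the model
# never trains on them, it only gets scored against them.
history = model.fit(X, y,
                    batch_size=128,
                    epochs=20,
                    validation_split=0.1)
```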

But how do you know which architectures have enough realistic potential to be worth testing? Fuck knows. I think deep-learning professionals develop decent ‘gut feelings’ with experience. As a noob I don’t have those, so I just randomly started with several hundred neurons in each layer because ¯\_(ツ)_/¯.

So that’s what I’ve been doing for the past few weeks. The final setup you see in my code may be really inefficient compared to other possible setups, but it’s the most accurate of all those I tested.

Enough fancy machine learning bullshit…Results!

Here is a nice log summary of the recurrent neural network.

This run was not the most accurate setup achieved, but it was close. The best runs scored validation accuracies in the high 11% range and maintained them for a decent number of epochs.

Epoch number 1

Training Accuracy: 6.91% - Validation Accuracy: 8.29%



Pick:

Ogre Magi, Keeper of the Light, Bristleback, Phantom Assassin, Lina

VS

Jakiro, Drow Ranger, Vengeful Spirit, Viper, Venomancer



Ban:

Morphling, Invoker, Viper, Vengeful Spirit, Nature's Prophet

VS

Tinker, Keeper of the Light, Sniper, Sniper, Sniper



Last pick accuracy: 8.2 %



--------------------------------------------------------------------------------------------------------------------------------------



Epoch number 2

Training Accuracy: 9.91% - Validation Accuracy: 9.71%



Pick:

Earthshaker, Shadow Shaman, Clockwerk, Puck, Anti-Mage

VS

Night Stalker, Ancient Apparition, Sand King, Invoker, Phantom Lancer



Ban:

Nature's Prophet, Venomancer, Faceless Void, Sven, Anti-Mage

VS

Necrophos, Lich, Viper, Bloodseeker, Anti-Mage



Last pick accuracy: 9.1 %



--------------------------------------------------------------------------------------------------------------------------------------



Epoch number 3

Training Accuracy: 10.86% - Validation Accuracy: 10.18%



Pick:

Lich, Clockwerk, Legion Commander, Troll Warlord, Queen of Pain

VS

Earthshaker, Shadow Shaman, Clockwerk, Ursa, Tinker



Ban:

Oracle, Venomancer, Nature's Prophet, Bloodseeker, Invoker

VS

Necrophos, Night Stalker, Viper, Faceless Void, Queen of Pain



Last pick accuracy: 9.5 %



--------------------------------------------------------------------------------------------------------------------------------------



Epoch number 4

Training Accuracy: 11.53% - Validation Accuracy: 10.76%



Pick:

Disruptor, Ogre Magi, Slardar, Lifestealer, Shadow Fiend

VS

Jakiro, Vengeful Spirit, Weaver, Viper, Undying



Ban:

Storm Spirit, Invoker, Drow Ranger, Luna, Troll Warlord

VS

Tinker, Bristleback, Sniper, Zeus, Lina



Last pick accuracy: 9.7 %



--------------------------------------------------------------------------------------------------------------------------------------



Epoch number 5

Training Accuracy: 12.09% - Validation Accuracy: 11.23%



Pick:

Ogre Magi, Disruptor, Magnus, Juggernaut, Storm Spirit

VS

Jakiro, Vengeful Spirit, Weaver, Viper, Witch Doctor



Ban:

Slark, Invoker, Drow Ranger, Luna, Outworld Devourer

VS

Tinker, Bristleback, Sniper, Zeus, Queen of Pain



Last pick accuracy: 10.7 %



--------------------------------------------------------------------------------------------------------------------------------------



Epoch number 6

Training Accuracy: 12.58% - Validation Accuracy: 10.95%



Pick:

Jakiro, Slardar, Dazzle, Viper, Huskar

VS

Undying, Weaver, Witch Doctor, Necrophos, Nature's Prophet



Ban:

Abaddon, Bristleback, Drow Ranger, Vengeful Spirit, Troll Warlord

VS

Tinker, Sniper, Keeper of the Light, Zeus, Ursa



Last pick accuracy: 10.2 %

*Note: the 1st banned hero is simply a randomly selected hero, not a ‘predicted’ hero. Every hero in the draft thereafter is a prediction based on the heroes that came before it.

You can see that in the first training epoch,

Training Epoch: Remember when I said earlier that we loop through every draft, updating the weights as we go? Often one loop through our data isn’t enough to update the connection weights to their optimal values [optimal values are just whatever weights produce the highest accuracy], so we do multiple loops through our training data. Each loop is called an ‘epoch’. It got its name from machine-learning researchers being insane, finding it necessary to create a brand new fancy lord-of-the-rings sounding term for every single little fucking concept they use.

we have quite a few duplicates: Sniper gets banned 3 times, Viper is duplicated, Venge is duplicated.

This might not be instantly obvious, because it manages to learn so much in the first epoch!

However, if you lower the learning rate, the first epoch comes out like this:

Pick:

Earthshaker, Keeper of the Light, Io, Necrophos, Lich

VS

Earthshaker, Earthshaker, Earthshaker, Faceless Void, Faceless Void



Ban:

Tinker, Tinker, Faceless Void, Faceless Void, Juggernaut

VS

Sven, Invoker, Tinker, Invoker, Invoker

Noticeably poorer drafting.
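For reference, the learning rate is just a knob on the optimizer. A sketch of turning it down (RMSprop and the exact values here are illustrative, not my settings):

```python
from keras.optimizers import RMSprop

# RMSprop's default learning rate is 0.001; shrinking it makes each
# weight update smaller, so the first epoch learns far less.
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(lr=0.0001),
              metrics=['accuracy'])
```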

We cannot let it run forever, to a state where it almost never duplicates, because as seen in the final epoch’s log line, our validation accuracy has actually started going down! We are now starting to overfit our data, and further training is making our model worse!
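One standard guard against this (which Keras gives you for free via the EarlyStopping callback, though I stopped my runs by hand) is to halt training as soon as validation accuracy stops improving:

```python
from keras.callbacks import EarlyStopping

# Stop once validation accuracy hasn't improved for 2 epochs in a
# row, rather than training on into overfit land.
early_stop = EarlyStopping(monitor='val_acc', patience=2, mode='max')

model.fit(X, y, batch_size=128, epochs=50,
          validation_split=0.1, callbacks=[early_stop])
```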

Summary

Produces realistic looking drafts

Produces more sensible drafts than 99% of pub players

11% validation accuracy

11% kind of sucks in comparison to Winter. Humanity is safe*

Model does not quite fully learn that it can never pick banned heroes. It does get close, genuinely trying to avoid duplicates

*I am making the assumption that analysts have considerably greater than 11% accuracy, based on my gut feeling from watching a lot of DotA 2 tournaments. Human draft-prediction accuracy is hard to measure definitively, because analysts are more likely to put forward pick suggestions when they are more confident in them.