imp42 Profile Blog Joined November 2010 398 Posts #1 Towards a good StarCraft bot - Part 4 - "Planning (1/3)"



Summary:

Instead of trying to plan ahead in the full StarCraft game, we design a mini-game where planning ahead is easy. The complexity of the game is thereby moved down one layer of abstraction and divided into two separate problems: finding a good translation between the real game state and the mini-game state, and finding good rules for the mini-game.



One of the problems with “solving StarCraft” is that there are thousands of possible moves at any given time and very little time to decide which one to pick. So approaches that have been used successfully in Chess and Go cannot be transferred directly to StarCraft.

Either we have to come up with novel approaches to search enormous solution trees or we have to invent a new game with fewer moves!



What if StarCraft consisted of only four possible moves?



Let’s define a new game, with only four possible moves as follows:



a) Increase own economy

b) Increase own army

c) Decrease enemy army

d) Decrease enemy economy



Ok, that sounds very simple. But what are the rules of the game?



1) The game is turn-based. Each player gets to pick one action per turn, then a payment is made to both players.

2) The payment adds X to the player’s bank.

3) Action a) costs 1 unit of money and increases X by 1 for the acting player (X: income per turn)

4) Action b) costs 1 unit of money to increase army size by 1

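5) Action c) The two armies fight. The smaller army A1 is set to 0. The larger army A2 is set to SQRT(A2^2 – A1^2) (i.e. the square root of the difference of the squares of the army sizes). In this rule, A1 and A2 denote the smaller and the larger army, not the armies of player 1 and player 2.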

6) Action d) can only be performed if army size > 0

---a. If enemy army = 0: decreases X by 2 for the enemy player

---b. If enemy army > 0: decreases X by 1 for the enemy player, costs 1 unit of army

7) A player wins the game if the enemy’s army and future economy are both 0



That’s it. Now we have a new game with 7 simple rules and 4 simple moves. It sure sounds a lot more manageable for an automated solver.



The game state is given as [B1, A1, I1, B2, A2, I2], with the B’s being the players’ banks, the A’s being the army sizes and the I’s being the incomes per turn.
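The rules above can be written down as a small state-transition function. The following is a minimal Python sketch, not the Prolog script mentioned later in the post. The names `Player`, `apply_move` and `pay` are made up for illustration, and clamping income and army at 0 is an assumption the rules do not spell out.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Player:
    bank: float    # B: money available to spend
    army: float    # A: army size
    income: float  # I: payment per turn (the X of rules 2 and 3)


def apply_move(me, enemy, move):
    """Return (me, enemy) after `me` plays move 'a', 'b', 'c' or 'd'."""
    if move == "a":                      # rule 3: buy economy
        return replace(me, bank=me.bank - 1, income=me.income + 1), enemy
    if move == "b":                      # rule 4: buy army
        return replace(me, bank=me.bank - 1, army=me.army + 1), enemy
    if move == "c":                      # rule 5: fight, quadratic losses
        survivors = math.sqrt(abs(me.army ** 2 - enemy.army ** 2))
        if me.army >= enemy.army:
            return replace(me, army=survivors), replace(enemy, army=0)
        return replace(me, army=0), replace(enemy, army=survivors)
    if move == "d":                      # rule 6: attack the enemy economy
        if me.army <= 0:
            raise ValueError("move d needs an army")
        if enemy.army == 0:              # rule 6a; clamp at 0 is an assumption
            return me, replace(enemy, income=max(0, enemy.income - 2))
        # rule 6b: costs one unit of army; clamps at 0 are assumptions
        return (replace(me, army=max(0, me.army - 1)),
                replace(enemy, income=max(0, enemy.income - 1)))
    raise ValueError(f"unknown move {move!r}")


def pay(player):
    """Rule 2: add the per-turn income to the bank."""
    return replace(player, bank=player.bank + player.income)


# A fight between an army of 5 and an army of 3 leaves 4 survivors.
winner, loser = apply_move(Player(0, 5, 1), Player(0, 3, 1), "c")
print(winner.army, loser.army)
```

Keeping the players immutable makes it cheap to branch the same state into several successor states when building a search tree.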



How would such a game play out?

Let’s consider the four moves of player 1 and the four answers by player 2 for each. It turns out that move D is illegal and move C does not make any sense, because there is no army to fight with. Hence, only actions A and B remain for the first turn.





After both players have completed their move, the payment is made. So if player 1 invests in army (move B) and player 2 invests in economy (move A), the result after the first turn will be [1,1,1,2,0,2]. Working backwards through rules 2 to 4, this implies a starting state of [1,0,1,1,0,1].

A first game state evaluation shows two symmetrical outcomes and two asymmetrical outcomes.
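The four first-turn outcomes can be enumerated mechanically. The sketch below assumes both players start with bank 1, army 0 and income 1, which is the starting state implied by the [1,1,1,2,0,2] example. The function names are illustrative only.

```python
from itertools import product

START = (1, 0, 1)  # (bank, army, income) per player; assumed starting values


def first_turn(player, move):
    """Apply move 'a' or 'b', then the end-of-turn payment."""
    bank, army, income = player
    if move == "a":      # invest in economy
        bank, income = bank - 1, income + 1
    elif move == "b":    # invest in army
        bank, army = bank - 1, army + 1
    else:
        raise ValueError("only 'a' and 'b' are considered on the first turn")
    return (bank + income, army, income)


def first_turn_outcomes():
    """Map (move of player 1, move of player 2) to [B1, A1, I1, B2, A2, I2]."""
    return {(m1, m2): list(first_turn(START, m1) + first_turn(START, m2))
            for m1, m2 in product("ab", repeat=2)}


for (m1, m2), state in first_turn_outcomes().items():
    kind = "symmetrical" if state[:3] == state[3:] else "asymmetrical"
    print(m1.upper(), m2.upper(), state, kind)
```

The two outcomes where both players pick the same move are symmetrical; the other two are mirror images of each other.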



Let’s look at the next turn! Now it already gets very interesting, because all four moves are possible.

For player 1:





The tree for the reactions of player 2 to the second move of player 1 gets quite big. For reference:

[Image: tree of player 2's replies to the second move of player 1]



We can now perform a second game state evaluation.



- Let us call a move “strictly inferior” for a player if the resulting game state is dominated by the other player.

- Let us also refer to a game state as “dominated by player 1” if B1+A1+I1 > B2+A2+I2 and B1>=B2, A1>=A2, I1>=I2 (i.e. he has more than the other player in total, and not less on any single variable).
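The dominance test from the two definitions above is easy to express in code. This is a small sketch, with `dominates` and `classify` as illustrative names:

```python
def dominates(own, other):
    """True if `own` = (bank, army, income) dominates `other`.

    Dominance means a strictly larger total and no single variable
    that is smaller than the opponent's.
    """
    return sum(own) > sum(other) and all(o >= e for o, e in zip(own, other))


def classify(state):
    """Classify a state [B1, A1, I1, B2, A2, I2]."""
    p1, p2 = tuple(state[:3]), tuple(state[3:])
    if dominates(p1, p2):
        return "dominated by player 1"
    if dominates(p2, p1):
        return "dominated by player 2"
    return "undecided"


print(classify([2, 1, 2, 1, 1, 1]))  # player 1 is ahead on bank and income
print(classify([1, 1, 1, 2, 0, 2]))  # army versus economy: neither dominates
```

A move is then "strictly inferior" for a player if `classify` reports that the resulting state is dominated by the opponent, and the corresponding branch can be pruned.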



It turns out that the resulting tree is not that big after all, because quite a few branches can be pruned. Grey states indicate an equal game; orange states show losing moves for player 1, yellow states show losing moves for player 2. This leaves us with only 5 states to examine further (the 5 white ones at the bottom level).



Remember: these are only the first two moves. It will be interesting to see how e.g. a greedy player (choosing action A too often) will be punished. The above process can be repeated for any number of moves, growing the tree to further levels.

I have implemented a small script in Prolog, which is able to play this game by performing the mini-max algorithm on the search tree.
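For readers who want to experiment, here is a depth-limited minimax sketch in Python. It is not the Prolog script. Several things in it are assumptions: the starting state, the "pass" fallback when a player cannot afford anything, the reading of rule 7 in `lost`, and the heuristic in `evaluate` (difference of the sums of bank, army and income).

```python
import math

WIN = 1000  # arbitrary large score for a decided game


def step(me, enemy, move):
    """Apply one move; each player is a (bank, army, income) tuple."""
    (b, a, i), (eb, ea, ei) = me, enemy
    if move == "a":
        return (b - 1, a, i + 1), enemy
    if move == "b":
        return (b - 1, a + 1, i), enemy
    if move == "c":
        left = math.sqrt(abs(a * a - ea * ea))
        if a >= ea:
            return (b, left, i), (eb, 0, ei)
        return (b, 0, i), (eb, left, ei)
    if move == "d":
        if ea == 0:
            return me, (eb, ea, max(0, ei - 2))
        return (b, max(0, a - 1), i), (eb, ea, max(0, ei - 1))
    return me, enemy  # "pass" (assumed fallback, not in the rules)


def moves(me):
    """Moves available to a player; buying needs money, c and d need an army."""
    out = (["a", "b"] if me[0] >= 1 else []) + (["c", "d"] if me[1] > 0 else [])
    return out or ["pass"]


def pay(p):
    return (p[0] + p[1 + 1], p[1], p[2])  # bank grows by income


def lost(p):
    """Assumed reading of rule 7: no army, no income, nothing left to spend."""
    return p[1] == 0 and p[2] == 0 and p[0] < 1


def evaluate(p1, p2):
    """Heuristic score from player 1's point of view (an assumption)."""
    if lost(p2):
        return WIN
    if lost(p1):
        return -WIN
    return sum(p1) - sum(p2)


def minimax(p1, p2, plies, p1_to_move=True):
    """Return (value, best_move); a full turn is two plies plus the payment."""
    if plies == 0 or lost(p1) or lost(p2):
        return evaluate(p1, p2), None
    best_val, best_move = None, None
    for m in moves(p1 if p1_to_move else p2):
        if p1_to_move:
            n1, n2 = step(p1, p2, m)
        else:
            n2, n1 = step(p2, p1, m)
            n1, n2 = pay(n1), pay(n2)  # payment after both have moved
        val, _ = minimax(n1, n2, plies - 1, not p1_to_move)
        better = best_val is None or (val > best_val if p1_to_move else val < best_val)
        if better:
            best_val, best_move = val, m
    return best_val, best_move


print(minimax((1, 0, 1), (1, 0, 1), plies=2))
```

With only one full turn of lookahead this heuristic prefers move A, because it cannot yet see an army being used. Deeper searches and a better evaluation function are where the interesting behaviour would show up.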



Discussion

The question whether this approach is any good when playing real StarCraft depends on two factors:



Factor 1: How well can a real game state be translated to such a simplified form?

For example, what value should we assign to A1, if player 1 has 5 marines and two siege tanks?



Factor 2: How well do the rules of the mini-game reflect the real rules of StarCraft?

The quadratic function of action c) (rule 5) approximates marines-only battles quite well, but it is no good when playing with units that counter each other. Another quite obvious deficiency is the lack of representation of parallel production.
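To make the quadratic battle rule concrete, here is a worked example. The rule has the same form as Lanchester's square law for equally strong units. The last formula is a possible refinement for unequal units and is an assumption, not something taken from the post: it introduces hypothetical per-unit kill rates \alpha (for the losing side's units) and \beta (for the winning side's units).

```latex
% Rule 5, with A_1 the smaller and A_2 the larger army
\[ A_2' = \sqrt{A_2^2 - A_1^2}, \qquad A_1' = 0 \]

% Example: an army of 5 fights an army of 3
\[ A_2' = \sqrt{5^2 - 3^2} = \sqrt{16} = 4 \]

% Possible refinement (assumption): unequal per-unit kill rates
\[ A_2' = \sqrt{A_2^2 - \tfrac{\alpha}{\beta} A_1^2}
   \quad \text{if} \quad \beta A_2^2 \ge \alpha A_1^2 \]
```

In the example the larger army destroys 3 units while losing only 1, which is why keeping an army together matters so much under this rule.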



Still, this extremely reduced mini-game already shows basic concepts of “macro”, “rush”, “harass”, “low-eco games”, “high-eco games”, etc.

Outlook: The next steps involve some refinement of the rules and then learning a good translation from real game state to simplified game state. Such learning could be done by various means, neural networks being one of them.



Disclaimer

Yes, the mini-game shown here oversimplifies hugely. It is just a starting point. The goal is to design a mini-game that represents the necessary (and only the necessary!) information needed to advise a bot on which higher-level plan to follow.



The graphs shown were created manually. A quick cross-check with the implemented script shows they are correct. Nevertheless, there could be typos.



Credits

Credits go to like-a-boss and the people at IRC ##prolog for their help with the Prolog script.
















Jett.Jack.Alvir Profile Blog Joined August 2011 Canada 2250 Posts #2 Again this is incredibly fascinating stuff. Your process in developing the bot really simplifies the game, yet unearths a lot of nuances when you translate it back to the actual game.



So your mini-game doesn't take into account increasing tech. If you tried to do that, how would it affect your tree? Considering you can't 'decrease enemy tech' once it reaches completion, perhaps this decision has no positive effect until X turns later. Also, an option to 'decrease enemy tech' is available to your opponent before turn X occurs. As well, when turn X occurs and tech completes, how will this affect the army?



I also wonder how you translate all this considering fog of war. It seems in your mini-game all information is available to both players.



Essentially, you distill any game down to a few choices, which is really what SC is all about.



Do I build an army or economy? When I have an army, do I attack his army or attack his economy?

imp42 Profile Blog Joined November 2010 398 Posts Last Edited: 2016-10-24 18:47:45 #3 On October 25 2016 02:07 Jett.Jack.Alvir wrote:

Again this is incredibly fascinating stuff. Your process in developing the bot really simplifies the game, yet unearths a lot of nuances when you translate it back to the actual game.



So your mini-game doesn't take into account increasing tech. If you tried to do that, how would it affect your tree? Considering you can't 'decrease enemy tech' once it reaches completion, perhaps this decision has no positive effect until X turns later. Also, an option to 'decrease enemy tech' is available to your opponent before turn X occurs. As well, when turn X occurs and tech completes, how will this affect the army?



I see my approach as a kind of reverse-engineering of what constitutes the complexity of the game. So I start as simple as possible and then model additional features from there.



It is "relatively" easy to calculate the army size required for an upgrade to be worth it. Let's just do another wild simplification and state the cost of the infantry weapon level 1 upgrade as 200 minerals. That is 4 marines.

So I could either have the upgrade or 4 more marines. I.e. the upgrade is worth its cost if it increases total damage (in the game!) by more than what 4 marines would produce.

The second factor that comes into play is the fact that if you decide _not_ to upgrade, this will affect you later in the game because you will always lag behind in upgrades. The rules of the game make it impossible to decide to delay the first upgrade but speed up the second one.

This leads me to believe that, for now, I can just assume both players upgrade at the same optimal time, such that they cancel each other out. Remember that any feature of the game is only relevant here, if it influences what move will be picked in the mini-game.
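As a rough worked example of the break-even point: assume each marine deals 6 damage per shot and the level 1 weapon upgrade adds 1 (the Brood War values), and compare N upgraded marines against N + 4 marines without the upgrade. This ignores armor, gas cost, research time and the quadratic battle rule, so it is only a first-order sketch under those assumptions.

```latex
\[ \underbrace{7N}_{\text{$N$ upgraded marines}}
   \;>\; \underbrace{6\,(N + 4)}_{\text{$N+4$ plain marines}}
   \iff N > 24 \]
```

Under these simplifications the upgrade only pays off in raw damage once the army is larger than about 24 marines.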




I also wonder how you translate all this considering fog of war. It seems in your mini-game all information is available to both players.

My post on economy showcased how you can calculate the production possibilities frontier of the opponent. That is, calculate what he could possibly have at a given moment in the game. Since I wrote that post I have increased the accuracy of the mineral income prediction to roughly +/- 100 minerals for the first 20'000 frames (that is 13 minutes into the game). This prediction accounts for less-economical builds in that it considers the loss of future income as a consequence of having an army earlier.

Of course, with fog of war the range of possibilities keeps growing to the point where the calculation becomes useless (at the 13 minute mark the opponent could have basically anything). That is where scouting comes in. It greatly reduces the range of possibilities.

What you do next (which is the part I am working on right now) is to map the game states in such a reduced range to simplified states. Then you can calculate the appropriate answers for the different possibilities in the range.

Since it is a game of incomplete information, you will somehow have to determine which of the answers to the possible states you will pick. It may be the one that covers most possibilities, or the one that is most likely not to lose, etc.
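The last step, picking one answer for a whole range of possible enemy states, could look roughly like the sketch below. The payoff table is entirely made up for illustration, and `pick_answer` with its two rules ("maximin" for the safest answer, "coverage" for the answer that does not lose against the most states) is only one way to read the two options mentioned above.

```python
def pick_answer(payoff, rule="maximin"):
    """Choose an answer given payoff[answer][enemy_state] -> value for us."""
    if rule == "maximin":    # best worst case: least likely to lose badly
        return max(payoff, key=lambda a: min(payoff[a].values()))
    if rule == "coverage":   # non-losing against as many states as possible
        return max(payoff, key=lambda a: sum(v >= 0 for v in payoff[a].values()))
    raise ValueError(f"unknown rule {rule!r}")


# Hypothetical values, e.g. taken from a search of the mini-game for each
# enemy state that is still possible after scouting.
payoff = {
    "a": {"rush": -3, "macro": 1, "tech": 1},    # greedy economy move
    "b": {"rush": 0, "macro": -1, "tech": -1},   # safe army move
}

print(pick_answer(payoff, "maximin"))
print(pick_answer(payoff, "coverage"))
```

With these invented numbers the two rules disagree: the safe answer is the army move, while the economy move covers more of the possible states. Which rule to use is exactly the open design question.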

Essentially, you distill any game down to a few choices, which is really what SC is all about.

Do I build an army or economy? When I have an army, do I attack his army or attack his economy?



Yes! And I want to find out what variables are required to make such a choice. I am convinced that you do not need the full game state, but also that the mini-game alone is not enough. Now I am able to narrow down the answer from both sides.



PS: on a side note, I think I will soon be able to calculate a rough value of information in terms of minerals. Then I could evaluate the value and cost of scouting.


YokoKano Profile Blog Joined July 2012 United States 607 Posts #4 lol. you should heat map the trees based on unit position. it might be possible a sight range overlord could canvass the whole map. IQ 155.905638752

imp42 Profile Blog Joined November 2010 398 Posts #5 On October 25 2016 04:49 YokoKano wrote:

lol. you should heat map the trees based on unit position. it might be possible a sight range overlord could canvass the whole map.

I didn't quite get the idea/joke.



Regarding unit position, my assumption is that the positions do not matter when evaluating what move to choose.

More precisely, the assumption states that unit position is not an independent variable. Rather it can be translated into an army advantage when translating real game state into simplified game state.


YokoKano Profile Blog Joined July 2012 United States 607 Posts Last Edited: 2016-10-24 22:46:05 #6 On October 25 2016 06:03 imp42 wrote:


On October 25 2016 04:49 YokoKano wrote:

lol. you should heat map the trees based on unit position. it might be possible a sight range overlord could canvass the whole map.

I didn't quite get the idea/joke.



Regarding unit position, my assumption is that the positions do not matter when evaluating what move to choose.

More precisely, the assumption states that unit position is not an independent variable. Rather it can be translated into an army advantage when translating real game state into simplified game state.




it sounds like a unique solution. All the marine data will be located at the marine. In SC2 terms army value will include point value of say 12 marines, point value ranging from base to greater. The sum will be the marines' absolute value in game terms.



Could you use a related rates solution to increase army or economy each turn? Since we are standardizing both values, we can just compare the derivative and take the greater (or lesser from enemy perspective).

imp42 Profile Blog Joined November 2010 398 Posts Last Edited: 2016-10-25 05:25:10 #7 On October 25 2016 07:43 YokoKano wrote:

it sounds like a unique solution. All the marine data will be located at the marine. In SC2 terms army value will include point value of say 12 marines, point value ranging from base to greater. The sum will be the marines' absolute value in game terms.

Keep in mind that having the marines together is much more important, since army power is a quadratic function of army size. So you would need a higher order function representing how spread out the units are (minimizing), then a second order function representing point value from base. Because the full interaction of variables tends to get rather complex it would probably be rather difficult to extract the relevant ones by manual analysis. Instead, it seems a neural net could be a better candidate to find a good translation. This is an approach I am currently discussing at #bwapi.




Could you use a related rates solution to increase army or economy each turn? Since we are standardizing both values, we can just compare the derivative and take the greater (or lesser from enemy perspective).

Ok, I think I understand (but I'm not 100% sure). The reason I picked a search tree and actually play out the game, rather than maximizing some two-dimensional function, is that the values of the variables depend on each other. So the function to maximize would necessarily have to include all game state variables (so we're already talking about a 6-dimensional function). If I managed to find such a function allowing me to predict a move via its derivative, then that function would quite certainly become obsolete as soon as I change the rules of the game or add more variables to the state (while the mini-max just minimaxes over the new rules, whatever they are). Maybe it is possible to find such a function automatically, but that is currently out of my league.



So yes, I think I'm sacrificing run-time efficiency for generality and your solution could work for a specific mini-game.

Jett.Jack.Alvir Profile Blog Joined August 2011 Canada 2250 Posts #8 I know during my SC2 games, I never commit to a skirmish unless I think I have an advantage (either positional/size/unit type) or a goal (snipe particular unit/reduce his army size/distract him)



If I can't commit to a fight, my next choice is to expand (increase own economy) or harass (decrease opponent economy). Sometimes I can do both if my opponent gives me an opportunity, but most often not.



I keep this decision tree going until I either get impatient and just go yolo in one big battle, or I make a mistake and lose.



I think the hardest part of your mini-game is deciding whether you can fight the enemy army or not. In a real game, there are so many factors to take into consideration. Attacking his economy is almost always viable, because it's incredibly difficult to secure all your resources. You will almost always find something undefended.

YokoKano Profile Blog Joined July 2012 United States 607 Posts #9 Thinking a ways down the road, you should make sure all the solutions are Pareto optimal. You seem to be doing this but I cannot tell if some of the solutions are only temporarily dominated. It is possible some actions will need to be examined in continuum. If the bot does not commit to behavior on turn 5 or turn 6 it may not understand what is happening at turn 1. IQ 155.905638752

nepeta Profile Blog Joined May 2008 1872 Posts #10 I'm still impressed by the way you're doing things. Do put a paper up sometime, I'm sure many coders would benefit from it. Keep up the good work! Broodwar AI :) http://sscaitournament.com http://www.starcraftai.com/wiki/Main_Page

Jett.Jack.Alvir Profile Blog Joined August 2011 Canada 2250 Posts Last Edited: 2016-11-05 05:50:43 #11 I don't understand all of what you said YokoKano, but I do understand the point.



imp42, can your bot tell, at the end of a small battle, whether it gained or lost resources relative to the opponent? Or who ended the skirmish with a small advantage?



The games I have lost, I lost because I misread the army size. If your bot makes a mistake in examining the situation, how will it know whether to keep trying the same solution or move to another? Or how will it know it made a mistake at all?

imp42 Profile Blog Joined November 2010 398 Posts #12 On November 05 2016 14:34 Jett.Jack.Alvir wrote:

imp42, can your bot tell, at the end of a small battle, whether it gained or lost resources relative to the opponent? Or who ended the skirmish with a small advantage?

Up to now, I have taken a more calculation-based approach to the game, simply because I thought it would make sense to limit the game space mathematically as much as possible before attempting to explore such a space. For a small battle this means that the bot evaluates at every frame whether it is worth fighting or whether it should escape, just by comparing applied and received damage. Once the battle is over it does not have any notion of how well it did. However, dead enemy units do count towards the upper limit of what the enemy could still have.




The games I have lost, I lost because I misread the army size. If your bot makes a mistake in examining the situation, how will it know whether to keep trying the same solution or move to another? Or how will it know it made a mistake at all?

At least early to mid-game the probability of misreading is reduced somewhat because I know what the opponent could possibly have. So if I just see 2 marines when he could have 10 I could be suspicious.



As of now it has no concept of "mistake". It can safely assume it will not under-estimate enemy army size as long as calculating the upper bound is still computationally feasible (~20k frames). Over-estimating can lead to missed opportunities, but is generally not really harmful (because it implies being ahead).

Currently, the only way not to repeat a bad engagement is via tracking of enemy units and their last known positions.

Certainly more elaborate evaluations are required once more than one unit type is involved.
