Core Wars Genetics: The Evolution of Predation John Perry (jperry@cs.ucla.edu)

I. Introduction

The domain of binary biology, or binology, has enough interesting properties that there is truly no need to go outside of its bounds to provide good A-life experiments. Processes can be born, die, move, reproduce, communicate, and do a host of other things without having to stretch to doing things like smelling. The abundance of computer viruses and worms demonstrate the ability of processes to take on characteristics of life. In fact, these programs are lifelike enough to scare computer users into taking precautions against them as they would against any parasite. This raises an important issue in the field of binology. If we are to study processes which may potentially run amok (and in fact unpredictable behavior is something we often hope for), we must be careful not to affect other people's processes lest we follow the fate of Robert Morris. One way to keep our experiments leashed is to work on an isolated computer. The problem with this is that a computer with no hard disk, no network, and no other users that is still reasonably powerful is a hard thing to come by. A more reasonable way to maintain control is to run our experiments in a controlled environment which is still quite like the overall environment. Core Wars is one such environment.

II. Core Wars as a Binological Model

instructions DAT - a data value, non-executable. MOV - move data from one address to another ADD - add two values and store in the second location SUB - subtract two values and store in the second location CMP - compare two locations and skip an instruction if they contain identical information JMP - unconditional jump to some address JMZ - jump to some address on the condition of some data value being zero JMN - jump to some address on the condition of some data value being non-zero SPL - split into two processes, one starting at the next address and one at a specified address DJN - decrement some data value and jump to some address if the value decremented is now zero addressing modes # - immediate $ - direct @ - indirect < - indirect with predecrement illustration 1: the Redcode instruction set

Depending on how one wishes to apply it, Core Wars is capable of providing a wide variety of different kinds of A-life experiments. Steen Rasmussen [1990] altered the rules slightly and came up with a self-contained evolutionary environment, sort of a binary version of the primordial soup experiment. A potential result of this type of experiment would be to determine whether reproduction or metabolism tends to evolve first within a random coreworld. That is, to find out if delineated programs with some function develop first, or auto-associative clusters of code. Bill Buckley has shown various reactions between warriors, and involving warriors which degenerate to strange attractors over time. This type of experiment shows that Core Wars warriors are capable of behavior much like the biological process of ontogeny. Both of these types of experiment use a Core Wars arena as a closed world {2}, or in other words the whole experiment goes on within the realm of the arena, however it is defined. An alternative is to view the Core Wars arena as an open world, or more specifically as the visible part of some larger system.

The larger system I have in mind is an evolutionary process. By applying a genetic algorithm to warrior populations, it should be possible to evolve warriors which perform some desired task. The most obvious task to attempt to evolve in this domain is survival. That is, start with randomly generated code and attempt to evolve programs that will thrive in a Core Wars arena. This survival could be with no other opponent, with a fairly benign opponent, or even with an opponent that is rather savage. Each of these conditions of the evolution would likely cause different survival strategies to come about. For my project, I wanted to go one step further. Rather than evolving something which may have ended up as prey for some other warrior, I chose to evolve a predator. John Holland, among others, has spoken of a biological arms race; my goal is to create the world's first binological arms race.

III. Evolution in Core Wars

One difficult problem with applying genetic algorithms to Core Wars consists of choosing the level of the gene. As in biology, it isn't clear what the optimal size is for the smallest information carrying unit of genetic code. Three distinct levels present themselves as viable candidates for the level of the binological gene: The bit, the instruction, and the meta-instruction. By meta-instruction, I am simply referring to an abstract chunk of code which has some function which will show up at the phenotype level. In biological genetics features like 'blue eyes', or 'eight legs', or 'black feathers' would be an example of a meta-instruction, and in fact this is the level at which most work in biological genetics is done. A binological meta-instruction should consist of more than one instruction since most interesting things that can be done by a computer can't be done by one instruction. An example of the meta-instruction level with respect to Core Wars is something like 'copy block' or 'DAT bomb' {3}.

The researchers on the Tracker project chose the bit as the gene level. They could do this because they were representing FSA's and neural networks, which are fairly robust. Thus it was difficult if not impossible to generate creatures that wouldn't execute at all. In Core Wars and other assembly languages it is much easier to generate illegal code, as changing a single bit can easily create an illegal instruction. Furthermore, in the initial tracker project the ants were represented by 40 bits, thus allowing 2^40 possible creatures. In Core Wars, each instruction is 40 bits long, so the space of creatures is considerably larger, depending on how many instructions one allows per warrior (see appendix A for calculations of the size of the space of my experiment). The best justification for the bit as the gene level of binology is that it is truly the smallest unit of information that one can change on a computer.

On the other hand, single bits aren't always critical in a functional sense on computers. Changing a bit in an unused data field may make no difference in how the instruction at that address executes, or in the functional performance of the program. The instruction level is the smallest level of binological representation which assures functional distinction in the phenotype. For this reason, I chose this level to be the genes in my experiments, but I allow mutations at a sub-gene level. In other words, when two warriors are being mated, the instruction level is the level of crossover, although mutation works at a bit level. The Tracker project researchers suggested that assembly language is not robust enough for this type of experiments, but I have to disagree. Not even humans have achieved a 100% survival ratio, so why should the things that we're evolving have that advantage? I propose that a high death rate in fact more closely resembles biological life. Certainly I may evolve a high number of warriors that die after executing only a few instructions, but don't babies occasionally fall victim to crib-death? And yes, two excellent warriors may happen to spawn a warrior that isn't nearly as complex or effective, but don't birth defects (mongolism, for example) still exist in humans and other animals? My point is that genetics has never been a very clean process and it will probably always be generating bitter fruit as well as good.

Another problem to consider is how to go about fitness testing. Obviously it can be done by a MARS, but what is the best thing to fight the evolving warriors against? The most realistic thing to do, in an evolutionary sense, is to fight all of the warriors of each generation against each other. In this way, all warriors would be fighting other warriors of approximately the same evolutionary complexity, and hopefully comparable fighting skills. Dawkins would like this scheme, since it would have no obvious evolutionary goal. The problem with this is that it is a fairly slow process, as nature has shown us, and it also requires a great number of battles. An alternative is to fight all of the warriors against a common opponent, preferably one with fairly good fighting skills. A previous entry in the Core Wars tournament would be a likely candidate. This should be considerably faster, both because the number of battles at each generation will be less and because there will be a clear evolutionary goal. Contrary to what Dawkins might say, this is a realistic strategy since many biological animals in our world have to learn to deal with man- made opponents (dodging cars, avoiding electric fences, etc.).

The next thing to consider is which warrior to choose. If we choose a warrior that is too weak or too strong it won't be useful. Every warrior gets the same fitness score if they all win or all lose. It would be better to fight two or three warriors at varying levels of strength, thus allowing the genetic warriors to vary more widely in their fitness scores. Furthermore, this may keep them from evolving a strategy which is only useful against one specific warrior. On this same point, it may help to vary some of the warriors between generations, hopefully to provide added diversity of strategies. In terms of the fitness score itself, it can use the same formula which is used for the tournament: 3 points per win and 1 point per tie {4}. This type of fitness scoring has a side effect that it evolves survival as well as predation. This is actually very helpful, since a program which attacks well, but can't survive isn't as good as one that attacks well and can survive.

IV. Experiments Conducted

In addition there was a very small percent chance (.05 %, to be exact) of mutation for each instruction being copied. Whereas mutation in Tracker consisted of simply flipping a bit, it was a much more complex process in my experiment. After discussing this with a few people who had done some similar work, I ended up allowing a small random chance (5 %) of each bit in the instruction to be flipped. Since there are 40 bits in an instruction, this means that I should have expected about 2 bits to have been flipped in an average mutated instruction. Holland [1975] suggested that a dominance change operator could be used reduce the damage done by generation of illegal instructions in situations like this, but I felt that it wouldn't significantly harm the experiment to allow a few illegal instructions to be generated, since they would only be fatal to the specific warrior containing that instruction, and only if it executes the instruction. I did all of these described genetic manipulations directly on object code files, but if I want to clarify what a particular piece of code is doing, I can always disassemble it into Redcode source code. In future experiments there are some things I may change, like the fact that warriors only live for one generation, but in general the paradigm I used is practical because it was easy to understand and apply.

Experiment 1: the tournament warriors

It turned out to be worse than I thought. I did the experiment on an IBM PC, which allows a maximum of 200 normal files in a directory. After this, it allows files to be created in that directory, but it actually stores them all in one file, which slows down their access time considerably. Since my program generated 1000 warriors for a population, this was very costly in terms of time. Moreover, my method of fitness testing wasn't very successful as well. I couldn't very well run each warrior against every other warrior, as this would amount to 999,000 battles, and I knew this. My proposed solution was to test the warriors in smaller groups, and then test the ones that had done well in these smaller groups against each other. I started a round of fitness testing at about 10:00 the first night, and when it was still running in the morning and was only about 2/3 complete, I knew I was in trouble. I ended up by doing fitness testing by choosing pairs of warriors, competing them, and taking down the results by hand. I did this for two generations of 80 warriors tested per generation, and then I ran out of time.

I took the two best warriors from the second generation, named C19 and C25 (child 19 and child 25), and entered them in the tournament. The results of the tournament were about what I expected; the two genetic warriors didn't win. But surprisingly, they didn't finish in last place either. Actually they placed second and third to last, but this means that they both beat a program that was written by a human who deemed it worthy of entry in the tournament. Furthermore, C19 beat one of the warriors that made it to the finals in a battle. This was notable performance for the first genetic warriors ever entered in the tournament, but not nearly as good as I had hoped to achieve.

This experiment was useful in a few ways, mainly as a critical review of my methods for the experiment which in general were crisis motivated and fairly poor. I used this information to improve my methods for the second experiment, and the results were much better. One of the bad methods was that my inadequate allocation of time didn't allow me to generate and test large enough populations. Eighty warriors per generation is not nearly a large enough sample size to even make a dent in the space I am exploring, and it also doesn't allow for sufficient variation between populations, somewhat like inbreeding in humans. Another problem came from my choice of warriors to seed the 0th generation. Since these warriors were significantly different in their design, they didn't tend to breed well, it was a bit like breeding dogs and sponges, or apples and scorpions. This meant that there were a vast number of progeny that were non-functional (i.e. that died after executing only a few instructions).

Experiment 2: the better warriors

I changed the method of fitness testing to test all of the warriors of each generation against a few known warriors, thus allowing me to do larger population sizes of 1000 warriors per generation. Two warriors which I chose to be used in testing all generations were selected for how they did in the 1990 International Core Wars tournament. QUARTER2 finished fourth to last, and thereby was chosen to provide an easy kill for the warriors. I felt this was necessary to get the early generations off the ground, and it worked out rather nicely. WANG1 was one of the finalists in the tournament (the finals results aren't out yet), and thus it was chosen to sort of play king of the mountain. Warriors that tied or defeated WANG1 would have to have something going for them. By battling WANG1 and QUARTER2 against each generation, I provided a consistent yardstick against which successive generations could be measured. Additional warriors I used for fitness testing included two past winners of the International Core Wars Tournament (FERRET and COWBOY) and three of the other four finalists from this years tournament (PWRBOMB, HYPRAY, and RNDMFIRE). All five of these other warriors were to play a similar role to that of WANG1, and also to help distinguish between the good and the very good warriors. By varying which additional warriors competed against the genetic warriors each generation I hoped to evolve more robustness, and greater variety of strategies.

Fitness testing still took a long time to run however, each generation taking about 24 hours to test. In addition, although the testing was done by the computer, I had to go through the results of the battles by hand, calculate the fitness scores (3 points per win and 1 point per tie), and determine which warriors would be the lucky parents for the next generation. This was somewhat time consuming, so I only completed three generations by the time of the writing of this paper. This is an ongoing project however, and I expect to complete ten to twenty generations before I make any bold statements about the validity of the experiment. I selected the top 10 % of each generation to be allowed to reproduce, but this was partially an ad hoc process, since there wasn't a clear line to break at in most cases between about the top 5 % and 20 %. In other words, in each generation some warriors did very very well, and many warriors did just pretty well. To do tie breaking between the ones that only did pretty well, I had to use arbitrary methods of choosing, which usually meant battling them against each other and watching for warriors which were particularly effective against their sibling warriors.

V. Results and Conclusions

On observing some of the later battles I noticed that a sort of speciation occurred. There were large numbers of certain types of warriors and then a few novel types. One type of warriors would deteriorate to a single instruction that was executing (probably a JMP $0, which just jumps to itself ad infinitum). I called this species the hiders. This strategy gained many ties against warriors that bombed at uneven intervals throughout memory, since they had trouble finding the one location in memory where their opponent was. Programs that copy themselves around the arena often copied themselves over one of these warriors, and the hider then comes out of its hiding place and mimics the other program. Hiders tended to get many ties, but only won by some strange chance, like as a side effect of mimicry. Hiders are particularly susceptible to be killed by warriors that bomb consecutively through memory.

A species that does that very thing are the ones that I have named blind-bombers. These warriors bomb furiously through memory, usually with a DAT bomb. If placed close behind a program like FERRET or RNDMFIRE one of these programs could strike quickly and get a kill, but if a program moves out of the way a blind-bomber will eventually bomb around the whole space (which addresses as a modulus ring) and kill itself. A variant of this species which my program evolved is the imp {5} blind-bomber . This amazing little program not only bombs consecutively with an imp, but it also evolved an imp stomper to catch the imps it creates. The fact that it evolved cooperative processes in such a small number of generations is quite a feat. Another variant of the blind bomber is the blind-JMP-bomber, which bombs consecutively through memory with some sort of jump instructions. This sounds simple enough, but it caused a few problems for some of the "yardstick" warriors, and also produced unusual battles. They were probably only unusual to me because this is not a commonly used strategy for human-programmed warriors. So far, I am pleased with the results of the second experiment and I hope that future generations turn out to be as interesting and enlightening as the first three.

VI. Future Work

One possible experiment would be to test for preadaptation by using only one opposing warrior to test against for many generations and then change the opponent. It would be interesting to see if the programs that evolved in this situation developed a specific strategy for their long term opponent, and if so, how long they retained any of that strategy after said opponent was gone. Other interesting experiments would be to attempt to evolve movement or self-reproduction. The problem with these potential experiments is how to go about fitness testing for self-reproduction or movement. Perhaps movement could be forcibly evolved by a certain kind of opposing warrior, but self-reproduction is not that simple. A feature which could enhance the power of this type of Core Wars experiments would be a disassemler.

The speed of my experiments and the size of the population could be vastly increased if I were to run the experiments on a more powerful machine (say for example a CM2). This would also generate a whole new set of problems to be solved like how to run the fitness testing in parallel and how to represent an 8K arena (8K 40 bit words, that is) in the 8K bit memory space of each cell. On the positive side, though, 200 generations in an hour sounds much better than 4 generations in a week. With a space as large as the one that I am exploring in this experiment, the additional size and speed should be a vast improvement in terms of area of space searched. If I can get fairly complex warriors in only 3 generations on an IBM PC, just imagine what kind of warriors could be evolved with a CM2.

Acknowledgements

Minimum Average Maximum I. Instruction mix of warriors of minimal, maximal, and average size: Initial DATs 0 5 9 Instructions 10 20 30 Final DATs 0 5 9 II. Number of possible warriors of minimal, maximal, and average size: a) By number of bits: 2^400 2^1160 2^1848 b) By legal instructions only: 2^390 2^1140 2^1818 c) What I actually allowed: 77^10 x 2^300 77^20 x 2^920 77^30 x 2^1476

a) By number of bit - In this case, the DAT instructions at the beginning and end will have 36 bits of unspecified information (4 of the bits indicate a DAT) and the general instructions will have 40 bits of unspecified information, so:

MIN - 10 instructions @ 40 bits/instr = 400 bits of unspecified data => 2^400 possible warriors

AVG - 10 DAT instructions @ 36 bits/DAT = 360 bits + 20 instructions @ 40 bits/instr = 800 bits = 1160 total bits of unspecified data => 2^1160 possible warriors

MAX - 18 DATs @ 36 bits/DAT = 648 bits + 30 instructions @ 40 bits/instr = 1200 bits = 1848 total bits of unspecified data => 2^1848 possible warriors

b) By legal instructions only - This restriction merely removes certain instruction types which are impossible. Specifically, there cannot be an immediate addressing mode for the A-fields of the JMP, SPL, DJN, JMZ, or JMN instructions or the B-fields of the ADD, SUB, or MOV instructions, since the former specify addresses to store data at and the latter specify addresses to branch to. The size of the A-field and B-field are each 16 bits, thus there are 32 bits of data per instruction.

DAT and CMP instructions -

2 different opcodes = 2

36 bits free = 2^36

=> 2 x 2^36 = 2^37 legal DAT or CMP instructions

JMP, SPL, DJN, JMZ, and JMN instructions -

5 different opcodes = 5

3 possible modes for A-field = 3

2 bits for mode for B-field + 32 bits of data = 2^34 => 3 x 5 x 2^34 = 15 x 2^34 legal branching instructions

ADD, SUB, and MOV instructions -

3 different opcodes = 3

3 possible modes for B-field = 3

2 bits for mode for A-field + 32 bits of data = 2^34 => 3 x 3 x 2^34 = 9 x 2^34 legal writing instructions

So for considering all possible legal instructions, there are a total of 2^37 + 15 x 2^34 + 9 x 2^34 = (8 + 15 + 9) x 2^34 = 32 x 2^34 = 2^39 legal instructions

MIN - 10 instructions @ 2^39 legal instructions = (2^39)^10 => 2^390 possible warriors

AVG - 10 DATs @ 36 bits/DAT = (2^36)^10 = 2^360 20 instructions @ 2^39 legal instructions = (2^39)^20 = 2^780 => 2^360 x 2^780 = 2^1140 possible warriors

MAX - 18 DATs @ 36 bits / DAT = (2^36)^18 = 2^648 30 instructions @ 2^39 legal instructions = (2^39)^20 = 2^1170 => 2^648 x 2^1170 = 2^1818 possible warriors

c) What I actually allowed - in addition to the restrictions to generate legal instructions, I also restricted the size of the A and B fields to (-8192,8192) since the size of the space I'm working in is 8192, and thus larger values would be redundant by modulus. This allows a total of 28 bits for the A and B data fields.

DAT & CMP : 2 opcodes = 2

28 bits of data field + 4 bits of addressing modes = 2^32 => 2 x 2^32 = 2^33 practical DAT or CMP instructions

JMP, SPL, DJN, JMZ, and JMN :

5 opcodes x 3 A-field modes = 15

28 bits of data field + 2 bits of B-field modes = 2^30 => 15 x 2^30 practical branching instructions

ADD, SUB, and MOV:

3 opcodes x 3 B-field modes = 9

28 bits of data field + 2 bits of A-field modes = 2^30 => 9 x 2^30 practical writing instructions

Therefore, considering all possible instructions, we have: 2^33 + 15 x 2^32 + 9 x 2^30 = (8 + 60 + 9) x 2^30 = 77 x 2^30 practical instructions

MIN - 10 instructions @ 77 x 2^30 practical = (77 x 2^30)^10

=> 77^10 x 2^300 possible warriors

AVG - 10 DATs @ 32 bits/DAT = 2^320

20 instructions @ 77 x 2^30 possible instructions = (77 x 230)^20 => 2^320 x 77^20 x 2^600 = 77^20 x 2^920 possible warriors

MAX - 18 DATs @ 32 bits/DAT = 2^576

30 instructions @ 77 x 2^30 possible instructions = (77 x 2^30)^30 => 2^576 x 77^30 x 2^900 = 77^30 x 2^1476 possible warriors

These figures are slightly misleading. They represent the number of possible warriors of a particular size, but I weighted my choices so that the range of the data fields of the branching and writing instructions tended to be smaller. In other words, an instruction that branches is more likely to branch a short distance than a long one, and an instruction that accesses data (either read or write), is more likely to access local data than data which is far away. I didn't have to do this, but I felt it would make programs that were more likely to have looping structures, and were also more likely to be self modifying.

Generation 0 1 2 3 Number of warriors 1000 1000 1000 1000 # of ties vs. QUARTER2 89 165 238 167 # of wins vs. QUARTER2 2 19 67 182 # of ties vs. WANG1 4 18 94 81 # of wins vs. WANG1 3 10 5 16 # of warriors with at least one 92 228 314 354 win or tie vs WANG1 or QUARTER2 # of warriors with at least one 4 28 71 182 win vs WANG1 or QUARTER2 # of ties vs. COWBOY d* 19 d 25 # of wins vs. COWBOY d 11 d 47 # of ties vs. FERRET d 2 d 12 # of wins vs. FERRET d 33 d 82 # of ties vs. PWRBOMB d d 35 16 # of wins vs. PWRBOMB d d 7 20 # of ties vs. HYPRAY d d 6 d # of wins vs. HYPRAY d d 184 d # of ties vs. RNDMFIRE d d 125 d # of wins vs. RNDMFIRE d d 40 d ratio of wins per battle .2 % 1.825 % 6.06 % 6.94 % ratio of ties per battle 4.65 % 5.1 % 9.96 % 6.02 % # of multi-battle winners 1 9 49 76 (vs all warriors battled) * - a 'd' indicates that the generation in that column did not play against the warrior in that row.

MICE degenerates to a strange attractor after about a million cycles in the Core Wars space

Collins, Robert J. Tracker . Technical paper of the UCLA computer science department (I'm not aware of any publication of this paper), Los Angeles, 1989.

Dawkins, Richard. The Selfish Gene . Oxford University Press, Oxford, 1976.

Dawkins, Richard. The Blind Watchmaker . W.W. Norton & Company, London, 1986.

Dawkins, Richard. The Evolution of Evolvability . From Artificial Life . Langton, C. (ed). Addison-Wesley, New York, 1989.

Dewdney, A.K. In the Game Called Core War Hostile Programs Engage in the Battle of Bits . from Scientific American , May, 1984.

Holland, John H. Adaptation in Natural and Artificial Systems . The University of Michigan Press, Ann Arbor, 1975.

Jefferson, David. The Genesys System: Evolution as a Major Theme in Artificial Life . Talk given at the Second Workshop on Artificial Life, Santa Fe, 1990.

Perry, John R. Predatory Programming . from The Core Wars Newsletter, Fall 1988.

Rasmussen, Steen. The Coreworld: Emergence and Evolution of Cooperative Structures in a Computational Chemistry . Talk given at Second Workshop on Artificial Life, Santa Fe, 1990. The associated paper to this talk credited Carsten Knudsen, Rasmus Feldberg, and Morten Hindsholm with co-authorship.

Stork, David G. Non-Optimality via Preadaptation in Simple Neural Systems . Talk given at the Second Workshop on Artificial Life, Santa Fe, 1990. The associated abstract also gives co-authorship credit to Bernie Jackson, Mark Burns, and Scott Walker.

Wicken, Jeffrey S. Evolution, Thermodynamics, and Information; Extending the Darwinian Program . Oxford University Press, New York, 1987.

Wilson, Stewart W. "The Genetic Algorithm and Simulated Evolution." From Artificial Life , Langton, C. (ed). Addison-Wesley, New York, 1989.

{ } I was careful not to use the term closed system here, since Steen's system constantly pumped new random data into the system, thus maintaining an influx of energy in to the system.

{ } Here, 'copy block' means copying a block of data from one location in memory to another location and 'DAT bomb' means copying DAT instructions throughout memory. Both of these are things which are commonly done by warriors.

{ } To be true to Holland's definition of a genetic algorithm, this value could be normalized by dividing by the maximum value to get a number between 0 and 1, but I chose not to do this. My method was actually considerably more ad hoc, and it will be described later.

{ } An imp consists of the single instruction "MOV $0 $1". This instruction, if executed by a process, will copy itself to the next location in memory. Then on the next clock cycle, the process will execute that copy and move itself ahead again, and so on. This is the smallest moving program in Core Wars.

A-G-E-Y-S-V-P-T-S and n-e-g-j-w-a-h-r-x

then if a "crossover" is done at the 3rd gene, they will produce the following pair of child programs:

A-G-E-j-w-a-h-r-x and n-e-g-Y-S-V-P-T-S

The other common reproduction operation is mutation. In mutation, some piece of one of the child warriors is altered slightly. Thus if mutation was also applied on the first child in the previous example, the resulting children may look like:

A-G-F-j-w-a-h-r-x and n-e-g-Y-S-V-P-T-S

The object in GAs is to start off with "ramdom" code and to let the GA evolve code to perform a desired task. Since programs are rewarded for how well they perform on the desired task, each "generation" tends to do better overall on the desired task.