A self-modifying C# application that uses a genetic algorithm to draw Banksy's Girl With Balloon

Published 10th June 2016 Progress updated 13th June 2016 19:27 GMT

With a million monkeys typing on a million keyboards, how long would it take them to produce the complete works of Shakespeare? Using a genetic algorithm, it might only take half of infinity.

With Visual Studio 2015 coming with support for C# Scripting, I decided that it would be an excellent time to bring the Singularity/Skynet a little closer to reality by creating an application that can improve itself. By only giving the application the capacity to select shapes to draw, though, I have hopefully averted catastrophe. If, however, you are reading this blog in the comfort of your own Matrix pod with sentinels patrolling outside for pro-human dissidents - oops.

What is C# scripting?

A script is a program that isn't precompiled - it's a text file that is interpreted at runtime. Project Roslyn has been the effort over the past few years to make available the parsers that Visual Studio uses to interpret C# and VB. As part of this effort, we have gained the ability to execute C# that is stored in a text file (or even just created on the fly). Isn't that cool?
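To make that concrete, here is a minimal sketch of running C# that only exists as a string. It assumes the Microsoft.CodeAnalysis.CSharp.Scripting NuGet package is referenced; the `ScriptingSketch` class and its method name are made up for illustration and are not taken from the application described in this post.

```csharp
// Assumes the Microsoft.CodeAnalysis.CSharp.Scripting NuGet package is installed.
using System.Threading.Tasks;
using Microsoft.CodeAnalysis.CSharp.Scripting;

// Hypothetical helper class, named for illustration only.
public static class ScriptingSketch
{
    // Compiles and runs a C# expression held in a string and returns its value.
    public static Task<int> EvaluateAsync(string expression)
    {
        return CSharpScript.EvaluateAsync<int>(expression);
    }
}
```

Calling `await ScriptingSketch.EvaluateAsync("1 + 2")` hands the text to the Roslyn scripting engine at runtime, which is the capability that lets a program run code it has just written for itself.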

What is a genetic algorithm?

A genetic algorithm is a method of refining something by mimicking the survival-of-the-fittest driver of evolution. Almost anything man-made has gone through some kind of genetic algorithm to arrive at its current form. "Okay guys, how about this? Seriously, listen. How about we put a hole in the toilet seat?" The mutations in those cases, however, are often intentional - an example of guided mutation.

Then there is random mutation. Why do you exist? Because every single one of your ancestors was ever-so-slightly better at surviving long enough to procreate than the competing organisms that didn't. In theory, every generation should be better than the previous generation. Having just spent an hour on a bus with teenagers, though, I'd like to have a word with Charles Darwin.

Let's take a closer look at reproduction. Skip this paragraph if you're underage.

1) Mutation

In organisms, this happens during meiosis and fertilisation. The genes of each parent are chosen at random to pass on to the next generation, with a pinch of mutation added for good measure. Our C# application reproduces asexually, though, because I'm of the opinion that unmarried applications should not be having sex. Call me old-fashioned if you like. The way that our script mutates is by taking the most recent successful script and either adding or removing a random line of code. Each new line of code introduces either a new Ellipse or a new Rectangle. Each new shape is drawn in Red, White, or Black - also chosen randomly.
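The mutation step described above might be sketched roughly as follows. This is not the application's actual code: the `MutationSketch` class, the `Mutate` signature, the choice to append new lines at the end, and the exact text of the generated drawing line are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the add-or-remove-a-line mutation.
public static class MutationSketch
{
    static readonly string[] Shapes = { "Ellipse", "Rectangle" };
    static readonly string[] Colours = { "Red", "White", "Black" };

    // Returns a copy of the parent script with one random line added or removed.
    public static List<string> Mutate(IReadOnlyList<string> parent, Random rng, int width, int height)
    {
        var child = new List<string>(parent);

        // Remove only when there is something to remove; otherwise always add.
        bool remove = child.Count > 0 && rng.Next(2) == 0;
        if (remove)
        {
            child.RemoveAt(rng.Next(child.Count));
            return child;
        }

        // Pick a random shape and colour, and a random box that fits the canvas.
        string shape = Shapes[rng.Next(Shapes.Length)];
        string colour = Colours[rng.Next(Colours.Length)];
        int x = rng.Next(width);
        int y = rng.Next(height);
        int w = rng.Next(1, width - x + 1);
        int h = rng.Next(1, height - y + 1);

        // Assumed line format: a System.Drawing-style fill call on a Graphics named g.
        child.Add($"g.Fill{shape}(Brushes.{colour}, {x}, {y}, {w}, {h});");
        return child;
    }
}
```

Note that the routine never looks at the target picture. It has no idea whether the line it just added helps or hurts, which is exactly what makes the mutation random rather than guided.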

2) Competition

In nature, competition for food and security is the most immediate factor - if you're dead, you can't reproduce. Then comes the need to find a mate who actually wants to reproduce with you. We won't go into that here. Our script has simplified these selection criteria. Each script produces a picture that is the same size as Girl With Balloon. We then compare that picture with Girl With Balloon and sum the differences between each pair of corresponding pixels.

    // original and variant hold the two colours being compared;
    // they and _score are declared outside this loop.
    for (int x = 0; x < target.Width; x++)
    {
        for (int y = 0; y < target.Height; y++)
        {
            original = target.GetPixel(x, y);
            variant = wannabe.GetPixel(x, y);

            // Add the absolute difference of each colour channel.
            _score += Math.Abs(original.R - variant.R)
                    + Math.Abs(original.G - variant.G)
                    + Math.Abs(original.B - variant.B);
        }
    }

Then, to find the most successful of the children of that generation, we do the following:

    static void FightToTheDeath(List<Child> children, ref long daddysScore)
    {
        // The lowest score is the closest match to the target.
        var winner = children.OrderBy(x => x.Score).First();

        // Every other child is discarded.
        foreach (var child in children.Where(x => x.Index != winner.Index))
        {
            child.DarwinAward();
        }

        // The winner survives only if it beats its parent.
        if (daddysScore > winner.Score)
        {
            daddysScore = winner.Score;
            winner.Survive();
        }
        else
        {
            winner.DarwinAward();
        }
    }



This will continue until either a generation scores 0 (no differences between it and Girl With Balloon), or our sun burns out. More on this below.
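To see the whole mutate-score-select loop in one place, here is a toy version that evolves a one-dimensional strip of greyscale pixels instead of a picture. It is only a sketch under stated assumptions: the `HillClimbSketch` class, the number of children per generation, the generation cap, and the "paint a random run black or white" mutation are all invented for illustration and stand in for the real script-based mutation.

```csharp
using System;
using System.Linq;

// Hypothetical toy model of the generation loop.
public static class HillClimbSketch
{
    // Sum of absolute differences between two equally sized greyscale "images".
    public static long Score(byte[] target, byte[] candidate)
    {
        long score = 0;
        for (int i = 0; i < target.Length; i++)
        {
            score += Math.Abs(target[i] - candidate[i]);
        }
        return score;
    }

    // Copies the parent and paints one random run of pixels black or white.
    static byte[] Mutate(byte[] parent, Random rng)
    {
        var child = (byte[])parent.Clone();
        int start = rng.Next(child.Length);
        int length = rng.Next(1, child.Length - start + 1);
        byte colour = rng.Next(2) == 0 ? (byte)0 : (byte)255;
        for (int i = start; i < start + length; i++)
        {
            child[i] = colour;
        }
        return child;
    }

    // Runs generations until a perfect score or the generation cap is reached.
    public static byte[] Evolve(byte[] target, int childrenPerGeneration, int maxGenerations, Random rng)
    {
        // Start from a plain white canvas.
        var best = Enumerable.Repeat((byte)255, target.Length).ToArray();
        long bestScore = Score(target, best);

        for (int generation = 0; generation < maxGenerations && bestScore > 0; generation++)
        {
            // Breed the children, score them, and keep the closest match.
            var winner = Enumerable.Range(0, childrenPerGeneration)
                .Select(_ => Mutate(best, rng))
                .Select(child => new { Pixels = child, Diff = Score(target, child) })
                .OrderBy(c => c.Diff)
                .First();

            // The winner replaces the parent only if it is a strict improvement.
            if (winner.Diff < bestScore)
            {
                best = winner.Pixels;
                bestScore = winner.Diff;
            }
        }
        return best;
    }
}
```

Because a child only survives when it strictly beats its parent, the score can never get worse from one generation to the next. The generation cap is there because, unlike the sun, my patience is finite.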

So when will it complete?

To work this out, I sought out my first year university Statistics textbook. It was busy serving as the counterweight for a construction crane.

After having reinforced my table, I flipped it open and started to read. When I woke up, I had some coffee, sprinkled my chair with drawing pins, and tried again.

Now, to work out when we expect something to happen, we divide the number 1 by the probability of that event occurring. For example, if we flip a coin, we have a 50% chance of it coming up as heads. That means that, on average, we expect to need 2 flips to get heads. If we do this experiment enough times, the average number of flips that it took to get heads will be 2.
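For anyone whose own Statistics textbook is also propping up furniture, the "one divided by the probability" rule is the mean of a geometric distribution. Assuming every attempt succeeds independently with the same probability p (the symbols N, k, and p are introduced here and do not appear elsewhere in the post), the number of attempts N up to and including the first success satisfies:

```latex
\[
P(N = k) = (1-p)^{k-1}\,p, \qquad
E[N] = \sum_{k=1}^{\infty} k\,(1-p)^{k-1}\,p = \frac{1}{p}.
\]
For a fair coin, $p = \tfrac{1}{2}$, so $E[N] = 2$.
```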

In this article, I'll only work out the expected value of the first success because that's by far the simplest scenario and I don't want anybody's death-by-boredom on my conscience.

We start off with a plain white canvas. We then draw a shape on it in either red, white, or black. A generation is successful if it corrects more pixels than it messes up.

For the first generation, a black or red shape will bring about a success if it covers more pixels that should be black or red than it does pixels that should be white. The probability that any one pixel should be black or red is the same as the number of black/red pixels in Girl With Balloon divided by the total number of pixels. There are 3774 red pixels, 13880 black pixels, and 319846 white pixels in Girl With Balloon.

What does this tell us? It tells us that the probability of any one pixel anywhere in the picture being black is 13880/337500, which gives us just over a 4% probability. Add to that the probability of a pixel being red - just over 1%, and we end up with a probability of 5.2% that a pixel is not white.

What do we do with this information? Well, given that we're drawing a red shape, we will have a 5.2% chance that the generation will succeed. We have the same chance if we're drawing a black shape. If we draw a white shape, however, the chance is 100% that the generation will fail because there will be no improvement on the white canvas that we started with. That means that there is a two-in-three chance of a 5.2% chance of success. That gives us a probability of success for the first generation of about 3.5%, so we expect the first success at around the 28th generation.
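The arithmetic can be checked with a few lines of C#. The pixel counts are the ones quoted in this post; the `FirstGenerationOdds` class and its method names are hypothetical, and the model itself (a coloured shape succeeds with the same probability that a pixel is non-white) is the simplification used in the text rather than an exact result.

```csharp
using System;

// Hypothetical helper that reproduces the first-generation estimate.
public static class FirstGenerationOdds
{
    // Pixel counts quoted in the article for Girl With Balloon.
    public const int Red = 3774;
    public const int Black = 13880;
    public const int White = 319846;

    // Fraction of pixels that are not white.
    public static double NonWhiteFraction()
    {
        return (double)(Red + Black) / (Red + Black + White);
    }

    // Two of the three colours (red and black) can improve a white canvas.
    public static double SuccessProbability()
    {
        return 2.0 / 3.0 * NonWhiteFraction();
    }

    // Mean of a geometric distribution: one divided by the success probability.
    public static double ExpectedGenerations()
    {
        return 1.0 / SuccessProbability();
    }
}
```

Carried through without rounding, the non-white fraction is roughly 0.0523, the success probability roughly 0.0349, and the expected wait roughly 28.7 generations, which is within rounding of the figures quoted above.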

As it turns out, our first success was the 30th generation, but if we ran that first generation over again enough times, we'd end up getting an average of 28.

As you will see in the graphs below, the probability of success decreases with each successful generation.

What is the current progress?

We are 431399 generations in, with 358 generations having produced an improvement on the previous generation.



As you can see from the above graph, the progress seems to be slowing down - each successful generation causes less and less of an improvement.



As you can see from the above graph, it is taking longer and longer for a generation to be successful.

Below, we have the most recent generation. Click on it to see a gif of the progress.

What have we learned?

As an entity becomes more evolved, it becomes harder and harder to improve upon it - if its competition remains the same - especially using a random mutation. Look at the crocodile - unchanged for 55 million years because it's so perfect that any mutation made it less efficient. However, we have shown that it is possible to create an application that is able to modify itself to get closer to a specified goal.

So what does that mean for the future? It means that it may be possible to put me out of work - if we create a set of, for example, unit tests against which we can score an algorithm, it may be possible to get some software to write itself. The mutation mechanism will definitely need some refinement - I have specifically ensured that the mutation routine of this application doesn't know what the goal is.