Yes, you read that headline correctly: I have results! However, those expecting authentic-sounding Grateful Dead – in whatever form that may take – will probably have to wait a lot longer. But if we view this whole process as akin to cooking a meal, I have at least sorted out the ingredients and cutlery, even if the food so far is somewhat lacking.

Again, our friend phase

So the basic approach was as outlined in the previous blog post. We build two computer programs: one to detect Grateful Dead music and the other to create music. Then we put these two machines in an evolutionary arms race, so they slowly get better at their jobs, and ultimately the generator should be able to create new Grateful Dead music (or at least new music that you cannot tell apart from the real thing).

The approach taken is not to use the raw audio directly (because this is difficult), but instead to do some processing on the sound beforehand. We split the audio into hundreds of time slices, and then work out the sine waves that make up each slice. Since this is the format we input, it is also the format we output.
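As a rough sketch of that preprocessing (the slice length and the NumPy implementation here are my own assumptions for illustration, not the project's actual code):

```python
import numpy as np

def slice_to_sines(audio, slice_len=1024):
    """Split audio into fixed-length time slices and return the
    magnitudes of each slice's sine-wave components (via the FFT)."""
    n_slices = len(audio) // slice_len
    slices = audio[:n_slices * slice_len].reshape(n_slices, slice_len)
    spectrum = np.fft.rfft(slices, axis=1)   # complex sine-wave components
    return np.abs(spectrum)                  # keep magnitudes, drop phase

# One second of 22050 Hz audio becomes 21 slices of 513 magnitudes each.
audio = np.random.randn(22050)
mags = slice_to_sines(audio)
print(mags.shape)  # (21, 513)
```

Note the last step: taking the absolute value keeps only how loud each sine wave is, which is exactly where the phase problem below comes from.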

This gives us a problem with the output. Let’s imagine a piece of audio composed of 2 sine waves. It would look like this:

2 sine waves, representing a single sound

You can see that the starting point for both of these sine waves – the left hand side of the graph – is different. The red sine wave starts high up and the blue low down. This information is known as the phase of the signal.

The problem is that our generator produces sine waves but no phase information, so it starts every sine wave at point 0 – the black line. But it has to do this for every time slice, and the result is a set of broken sine waves, where the wave is "reset" at the start of every time slice:

Audio file with phase=0 at start of every timeslice

As you can see, this is most definitely not going to sound like what we want!
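The "reset" is easy to reproduce. Here is a sketch (my own illustration, with assumed slice length) of rebuilding audio from magnitudes with phase forced to zero, as the generator effectively does:

```python
import numpy as np

def rebuild_zero_phase(mags, slice_len=1024):
    """Rebuild audio from per-slice magnitudes with every sine wave's
    phase set to zero. Each slice is then reconstructed identically,
    so the waveform restarts at every slice boundary."""
    spectrum = mags.astype(complex)          # zero imaginary part = phase 0
    slices = np.fft.irfft(spectrum, n=slice_len, axis=1)
    return slices.reshape(-1)

# Three slices with the same magnitudes come back as three identical
# copies of one waveform - the repeating "reset" visible in the plot.
tone = np.sin(2 * np.pi * 5 * np.arange(1024) / 1024)
mags = np.tile(np.abs(np.fft.rfft(tone)), (3, 1))
out = rebuild_zero_phase(mags)
```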

Simple solutions for the win

I really took some time to try to fix this issue. It is a known problem in audio processing, and there is basically no general fix. There have been some attempts using machine learning, but that would involve another huge amount of work. So instead, I did something very simple that seemed to work: I just randomised the phase information, instead of setting it to 0 all the time. As can be seen in the diagram above, there is a repetition in the phase information at the start of every time slice, and this repetition really sticks in the ear. If you randomise the phase information, you get something more like this:

Phase at every time stamp randomised instead of constant zero

This is not perfect, but now the results sound a lot better, and phase randomisation turned out to be not that hard to implement. With that out of the way, let’s move on to the results.
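The randomisation itself is only a few lines. A minimal sketch, again assuming the same slice format as above (parameter names are my own):

```python
import numpy as np

def rebuild_random_phase(mags, slice_len=1024, rng=None):
    """Rebuild audio from per-slice magnitudes, giving each sine wave
    a random starting phase instead of phase zero. This breaks the
    audible repetition at slice boundaries."""
    rng = np.random.default_rng() if rng is None else rng
    phases = rng.uniform(0, 2 * np.pi, size=mags.shape)
    spectrum = mags * np.exp(1j * phases)    # same loudness, random phase
    slices = np.fft.irfft(spectrum, n=slice_len, axis=1)
    return slices.reshape(-1)
```

With random phases, two slices with identical magnitudes no longer produce identical waveforms, so the ear-catching repetition disappears (at the cost of the phase still being wrong, just less obviously so).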

What we don’t expect

So essentially, the computer code is trying to replicate a certain style of music. If it were unable to incrementally improve, we might expect it to just produce random noise – in particular, white noise, which has equal power at every frequency:

3 second sample of white noise

Or even pink noise, which is apparently what the sound crew used to test the GD's audio system before a gig (pink noise carries equal power in every octave, so it sounds more balanced to the human ear than white noise):

3 second sample of pink noise
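Both kinds of noise are easy to generate for comparison. A sketch (my own illustration, not part of the project's code), where pink noise is made by shaping white noise so its power falls off as 1/f:

```python
import numpy as np

def white_noise(n, rng):
    """Equal power at every frequency."""
    return rng.standard_normal(n)

def pink_noise(n, rng):
    """Equal power per octave: shape white noise so power ~ 1/f."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n)
    f[0] = f[1]                    # avoid dividing by zero at DC
    spectrum /= np.sqrt(f)         # amplitude ~ 1/sqrt(f) => power ~ 1/f
    return np.fft.irfft(spectrum, n=n)

rng = np.random.default_rng(42)
three_seconds = pink_noise(3 * 22050, rng)   # 3 s at 22050 Hz
```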

So for any success, we don't want to sound like these two. What we do want is to sound like the Grateful Dead. So here is a sample of the Grateful Dead in exactly the same format as my results – a 22050Hz mono audio sample:

Bonus points for guessing year, show or song!

The Results

After many hours of rendering audio, my program produced some samples. To the best of my knowledge, this is the first computationally created Grateful Dead audio ever generated (more epochs should mean better results):

Render after 1000 epochs

Render after 1500 epochs

Render after 2000 epochs

So – is this progress? It is certainly not the Grateful Dead, but on the other hand it is not white or pink noise. It is also – unlike my previous posts – actual audio. I count this as a partial success, but also an indicator that the final goal is some distance away.

The Future

The obvious thing to do now is to increase everything – the amount of music I use as data, the length of time I spend processing, and the size of the output data. This will take some time. I also have another approach up my sleeve: generating songs at a very low resolution and then increasing the resolution, as opposed to starting with a small piece of music and trying to make it longer. But that's for another post!