Solution? Creative Adversarial Networks

The authors propose a modified GAN to generate creative content. They send an additional signal to the generator to keep it from generating content that is too similar to existing content. How did they do it? They modified the original GAN loss function from Equation 1.4.

Intuitive explanation of CAN

In the original GAN, the generator updates its weights based on the discriminator’s output, i.e. whether or not what it generated was able to fool the discriminator. CAN extends this in two ways:

First, the discriminator not only decides whether it thinks the data is real or fake, but also classifies which time period the artwork belongs to. Second, the generator takes in this additional time period information from the discriminator and uses it along with the discriminator’s real/fake output.
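To make the two signals concrete, here is a minimal sketch in plain Python. The function name `discriminator_outputs`, the raw scores, and the three-period setup are all made up for illustration; a real CAN discriminator is a neural network that produces such scores, and that part is not shown here.

```python
import math

def discriminator_outputs(real_score, period_scores):
    """Turn raw scores (hypothetical network outputs) into the two signals.

    Returns (p_real, period_probs): the real/fake probability and one
    probability per time period.
    """
    # Sigmoid squashes the real/fake score into (0, 1).
    p_real = 1.0 / (1.0 + math.exp(-real_score))
    # Softmax turns the period scores into probabilities that sum to 1.
    shift = max(period_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in period_scores]
    total = sum(exps)
    period_probs = [e / total for e in exps]
    return p_real, period_probs

# A generated image that the discriminator finds fairly real and that it
# strongly associates with the first time period.
p_real, period_probs = discriminator_outputs(1.2, [3.0, 0.1, -0.5])
print(p_real, period_probs)
```

The generator gets feedback from both values: `p_real` tells it how convincing the image is, and `period_probs` tells it how strongly the image resembles each known period.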

What is the point of doing this?

The original problem with GANs was that they would not explore new work. Their objective is literally just to make their data look like it came from the real dataset.

By having an additional metric that classifies the time period the data belongs to (along with the confidence), the generator now gets feedback on how similar its creation looks to some time period.

Now, the generator not only has to make its data look similar to the dataset, but also has to make sure it doesn’t look too similar to a single category. This keeps it from creating artwork that has the very specific characteristics of one category.

The new loss function is:

Equation 2.0
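For readers who want the objective in text form, here it is as I understand it from the CAN paper (Elgammal et al., 2017), rearranged so that the first line holds the original GAN terms and the second line holds the additions. The notation ($D_r$, $D_c$, $\hat{c}$, $c_k$, $K$, $p_z$) is assumed from the paper, so treat this as a transcription sketch rather than the authoritative form.

```latex
\begin{aligned}
\min_G \max_D V(D, G) ={}& \mathbb{E}_{x \sim p_{data}}\big[\log D_r(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D_r(G(z))\big)\big] \\
&+ \mathbb{E}_{x, \hat{c} \sim p_{data}}\big[\log D_c(c = \hat{c} \mid x)\big]
  - \mathbb{E}_{z \sim p_z}\Big[\sum_{k=1}^{K} \Big(\tfrac{1}{K}\log D_c(c_k \mid G(z))
  + \big(1 - \tfrac{1}{K}\big)\log\big(1 - D_c(c_k \mid G(z))\big)\Big)\Big]
\end{aligned}
```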

It’s really simple!

The first line is exactly the same as the original equation. Note that the subscript r denotes the discriminator’s real/fake output, and the subscript c denotes the discriminator’s classification output.

The second line is the modification for promoting creativity. I will explain it step by step.

Equation 2.1

This term measures how well the discriminator gets the class of the input image right. The discriminator will try to maximize this value, since we want it to classify the images correctly.
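As a small illustration, assuming Equation 2.1 is the log-probability the discriminator assigns to the true class of a real image, the sketch below evaluates that term for three made-up predictions. The function name `class_term` and all the probabilities are hypothetical.

```python
import math

def class_term(class_probs, true_class):
    """Log-probability assigned to the true class of one real image."""
    return math.log(class_probs[true_class])

# Hypothetical class probabilities for one real image whose true class is 0.
confident_and_right = class_term([0.90, 0.05, 0.05], true_class=0)
unsure = class_term([0.40, 0.30, 0.30], true_class=0)
confident_and_wrong = class_term([0.05, 0.90, 0.05], true_class=0)
print(confident_and_right, unsure, confident_and_wrong)
```

The value is always negative or zero, and it gets closer to zero the more probability the discriminator puts on the correct class, which is why the discriminator wants it as large as possible.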

Equation 2.2

This may look complicated, but it is just the multi-label cross-entropy loss, where K denotes the number of classes. You can find the detailed information about losses here. It is the same loss that classifiers use as a loss function. The difference is that the target is the same for every class, so the generator minimizes this loss by making the discriminator’s class prediction as uncertain as possible.
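Here is a minimal numeric sketch of that idea, assuming the summation in Equation 2.2 has the form used in the CAN paper: each class contributes (1/K) log p plus (1 - 1/K) log(1 - p), where p is the discriminator’s probability for that class. The function names and the example probabilities are mine, not the paper’s.

```python
import math

def ambiguity_sum(class_probs):
    """Sum over classes of (1/K) * log(p) + (1 - 1/K) * log(1 - p)."""
    k = len(class_probs)
    return sum(
        (1.0 / k) * math.log(p) + (1.0 - 1.0 / k) * math.log(1.0 - p)
        for p in class_probs
    )

def cross_entropy_loss(class_probs):
    """The cross-entropy loss is the negative of the summation."""
    return -ambiguity_sum(class_probs)

# Hypothetical class predictions for two generated images, with K = 4.
undecided = [0.25, 0.25, 0.25, 0.25]   # discriminator has no idea
recognizable = [0.97, 0.01, 0.01, 0.01]  # clearly looks like class 0

print(cross_entropy_loss(undecided))
print(cross_entropy_loss(recognizable))
```

The undecided prediction gives the smaller loss, so a generator trained against this term is pushed toward images the discriminator cannot pin to any one class.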

Intuitive explanation of Equation 2.2

Equation 2.2 works like this: if the score of any one class approaches 1 or 0, the value of the whole summation approaches -infinity. The largest possible value (larger is what the generator wants, because the cross-entropy loss is the negative of this summation) occurs when the discriminator is completely unsure about which class the input belongs to, i.e. every class gets the same score of 1/K and every term in the summation has the same value. This makes sense: if the input image cannot be properly classified into any of the existing classes, it effectively forms a new class of its own.
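For the curious, here is a short derivation of why the maximum sits at 1/K. It assumes each term of the summation has the form $f(p) = \tfrac{1}{K}\log p + (1 - \tfrac{1}{K})\log(1 - p)$, where $p$ is the score of one class and $K > 1$.

```latex
\begin{aligned}
f(p) &= \tfrac{1}{K}\log p + \big(1 - \tfrac{1}{K}\big)\log(1 - p), \qquad 0 < p < 1 \\
f'(p) &= \frac{1}{Kp} - \frac{1 - \tfrac{1}{K}}{1 - p} = 0
  \;\Longrightarrow\; 1 - p = (K - 1)\,p
  \;\Longrightarrow\; p = \tfrac{1}{K} \\
f''(p) &= -\frac{1}{Kp^2} - \frac{1 - \tfrac{1}{K}}{(1 - p)^2} < 0
  \quad\text{(so } p = \tfrac{1}{K} \text{ is a maximum)} \\
\lim_{p \to 0^+} f(p) &= -\infty, \qquad \lim_{p \to 1^-} f(p) = -\infty
\end{aligned}
```

Since the class scores sum to 1, setting every score to 1/K is allowed, so all K terms can reach their maximum at the same time.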

Conclusion

This paper talks about a loss function that pushes a GAN into exploring new content based on what is given to it. This was done by modifying the loss function to allow for exploration.

P.S.

This was my first technical post. Criticism and improvement tips are welcome and greatly appreciated.

If you learned something useful from my article, please share it with others by tapping on the ❤. It lets me know I was of help.