Implementation of GANs N’ Roses in TensorFlow:

Let’s build a simple DCGAN (Deep Convolutional Generative Adversarial Network) using TensorFlow and nothing else (except for Pillow).

But what’s a DCGAN?

DCGAN is a modified version of the vanilla GAN that addresses some of its difficulties, such as making the fake images look visually pleasing and improving stability during training, so that the generator can’t exploit a flaw in the discriminator by repeatedly outputting an image that fits the data distribution the discriminator is looking for but is nowhere close to a real image.

This is the Discriminator architecture we’re trying to build:

Discriminator Architecture

It can be seen that it takes in an image as the input and outputs a logit (1 for the real class and 0 for the fake class).

Next, we have the Generator architecture, which consists of conv_transpose layers that take in a set of random numbers as input and generate an image at the output.

Generator Architecture

The changes proposed by DCGANs taken directly from this paper are:

Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).

Use batchnorm in both the generator and the discriminator.

Remove fully connected hidden layers for deeper architectures.

Use ReLU activation in generator for all layers except for the output, which uses Tanh.

Use LeakyReLU activation in the discriminator for all layers.

Let’s start by collecting images of roses. One easy way to do this is to search for roses on Google Images and download all the images in the search results using a Chrome plugin such as ImageSpark.

We’ve collected 67 images (more would have been better), which are available here. Extract these images into the following directory:

<Project folder>/Dataset/Roses.

The code and the dataset can be obtained by cloning this repo on GitHub.

Now that we have our images, the next step is to preprocess them by resizing them to 64 × 64 and scaling the pixel values to the range [-1, 1].
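Here’s a minimal sketch of what that preprocessing might look like using Pillow and NumPy (the load_dataset helper and the choice of resizing filter are assumptions, not the repo’s exact code):

```python
import os

import numpy as np
from PIL import Image

def load_dataset(path, image_size=64):
    """Load every rose image, resize to 64x64 and scale pixels to [-1, 1]."""
    images = []
    for fname in os.listdir(path):
        if not fname.lower().endswith((".jpg", ".jpeg", ".png")):
            continue
        img = Image.open(os.path.join(path, fname)).convert("RGB")
        img = img.resize((image_size, image_size), Image.BILINEAR)
        # Map pixel values from [0, 255] to [-1, 1] to match the Generator's tanh output
        images.append(np.asarray(img, dtype=np.float32) / 127.5 - 1.0)
    return np.array(images)

dataset = load_dataset("Dataset/Roses")
```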

We’ll begin by writing functions that can later be used to build the convolution, convolution transpose, and dense (fully connected) layers, as well as the LeakyReLU activation (which wasn’t available in TensorFlow at the time of writing).

Function to implement convolutional layer
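A minimal sketch of such a helper, assuming a TensorFlow 1.x-style API (the function name and default arguments are assumptions; the repo’s version may differ):

```python
import tensorflow as tf

def conv2d(x, output_dim, kernel=5, stride=2, name="conv2d"):
    """Strided convolution; replaces pooling as per the DCGAN guidelines."""
    with tf.variable_scope(name):
        w = tf.get_variable("w", [kernel, kernel, x.get_shape()[-1], output_dim],
                            initializer=tf.truncated_normal_initializer(stddev=0.02))
        b = tf.get_variable("b", [output_dim],
                            initializer=tf.constant_initializer(0.0))
        conv = tf.nn.conv2d(x, w, strides=[1, stride, stride, 1], padding="SAME")
        return tf.nn.bias_add(conv, b)
```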

We use get_variable() instead of the usual Variable() to create variables in TensorFlow so that the weights and biases can later be shared among different function calls. Check out this post to learn more about shared variables.

Function to implement convolution transpose
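Again, a sketch under the same assumptions; note that the filter shape for a transposed convolution lists the output channels before the input channels:

```python
def conv2d_transpose(x, output_shape, kernel=5, stride=2, name="conv2d_transpose"):
    """Fractionally-strided (transposed) convolution used by the Generator."""
    with tf.variable_scope(name):
        # Filter shape is [height, width, out_channels, in_channels]
        w = tf.get_variable("w", [kernel, kernel, output_shape[-1], x.get_shape()[-1]],
                            initializer=tf.truncated_normal_initializer(stddev=0.02))
        b = tf.get_variable("b", [output_shape[-1]],
                            initializer=tf.constant_initializer(0.0))
        deconv = tf.nn.conv2d_transpose(x, w, output_shape=output_shape,
                                        strides=[1, stride, stride, 1], padding="SAME")
        return tf.nn.bias_add(deconv, b)
```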

Function to implement dense fully connected layer
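A possible implementation of the dense layer, following the same get_variable pattern:

```python
def dense(x, output_dim, name="dense"):
    """Fully connected layer: y = xW + b."""
    with tf.variable_scope(name):
        w = tf.get_variable("w", [x.get_shape()[-1], output_dim],
                            initializer=tf.truncated_normal_initializer(stddev=0.02))
        b = tf.get_variable("b", [output_dim],
                            initializer=tf.constant_initializer(0.0))
        return tf.matmul(x, w) + b
```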

Leaky ReLU
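Leaky ReLU is just max(x, alpha * x), so it’s a one-liner (the slope of 0.2 follows the DCGAN paper):

```python
def lrelu(x, alpha=0.2):
    """Leaky ReLU: lets a small gradient (alpha) through for negative inputs."""
    return tf.maximum(x, alpha * x)
```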

The next step is to build the Generator and the Discriminator. Let’s begin with our protagonist, the Generator. The Generator architecture we’ll need to construct is shown below:

Again, the Generator Architecture we’re trying to implement
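Here’s a sketch of how the Generator might be assembled from the helpers above (the exact layer widths and scope names are assumptions based on the figure):

```python
def generator(z, batch_size, reuse=False):
    """Maps a noise vector z of shape [batch_size, Z_DIM] to a 64x64x3 image."""
    with tf.variable_scope("Generator", reuse=reuse):
        # Project the noise and reshape it into a 4x4x512 feature map
        net = dense(z, 4 * 4 * 512, name="g_dense")
        net = tf.reshape(net, [-1, 4, 4, 512])
        net = tf.nn.relu(tf.contrib.layers.batch_norm(net, updates_collections=None))
        # Fractionally-strided convolutions upsample: 4x4 -> 8x8 -> 16x16 -> 32x32 -> 64x64
        net = conv2d_transpose(net, [batch_size, 8, 8, 256], name="g_deconv1")
        net = tf.nn.relu(tf.contrib.layers.batch_norm(net, updates_collections=None))
        net = conv2d_transpose(net, [batch_size, 16, 16, 128], name="g_deconv2")
        net = tf.nn.relu(tf.contrib.layers.batch_norm(net, updates_collections=None))
        net = conv2d_transpose(net, [batch_size, 32, 32, 64], name="g_deconv3")
        net = tf.nn.relu(tf.contrib.layers.batch_norm(net, updates_collections=None))
        net = conv2d_transpose(net, [batch_size, 64, 64, 3], name="g_deconv4")
        # Tanh keeps the output in [-1, 1], matching the scaled real images
        return tf.tanh(net)
```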

The generator() function builds a Generator (duh!) with the architecture shown in the above figure. The DCGAN requirements, such as removing fully connected hidden layers, using only ReLU (with Tanh at the output) in the Generator, and using batch normalization, have been satisfied.

Similarly, the Discriminator can easily be constructed as follows:

The architecture required:

The Discriminator architecture
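A matching sketch of the Discriminator, mirroring the Generator with strided convolutions (again, the layer widths are assumptions from the figure):

```python
def discriminator(x, reuse=False):
    """Takes a 64x64x3 image batch and returns one logit per image."""
    with tf.variable_scope("Discriminator", reuse=reuse):
        # Strided convolutions halve the spatial size: 64 -> 32 -> 16 -> 8 -> 4
        net = lrelu(conv2d(x, 64, name="d_conv1"))  # no batch norm on the first layer
        net = lrelu(tf.contrib.layers.batch_norm(conv2d(net, 128, name="d_conv2"),
                                                 updates_collections=None))
        net = lrelu(tf.contrib.layers.batch_norm(conv2d(net, 256, name="d_conv3"),
                                                 updates_collections=None))
        net = lrelu(tf.contrib.layers.batch_norm(conv2d(net, 512, name="d_conv4"),
                                                 updates_collections=None))
        # Flatten and map to a single logit per image
        net = tf.reshape(net, [-1, 4 * 4 * 512])
        return dense(net, 1, name="d_dense")
```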

Again, we’ve avoided fully connected hidden layers and used Leaky ReLU and batch normalization throughout the Discriminator.

Now for the fun part, training these networks:

The loss functions for the Discriminator and the Generator are shown below:

Discriminator loss (This must have a negative sign)
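In standard GAN notation, this is the Discriminator’s objective with the leading negative sign, so that minimizing it maximizes the log-likelihood of classifying real and fake images correctly:

```latex
\mathcal{L}_D = -\left( \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
              + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big] \right)
```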

Generator loss
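We use the non-saturating form of the Generator loss, which trains the Generator to push D(G(z)) toward 1:

```latex
\mathcal{L}_G = -\,\mathbb{E}_{z \sim p_z}\big[\log D(G(z))\big]
```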

Where x represents the real images and z is the noise vector fed to the Generator.

We’ll pass random inputs to the Generator; the shape of zin will be [BATCH_SIZE, Z_DIM]. The Generator should then give BATCH_SIZE fake images at its output, so the size of the Generator output will be [BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, 3]. This is the term G(z) in the loss function.

D(x) is the Discriminator, which takes in the real images or the fake ones and is trained to differentiate between them. In order to train the Discriminator on the real images, we’ll pass the real image batch to D(x) and set the target to 1. Similarly, to train it on fake images (which come from the Generator), we’ll connect the Generator output to the Discriminator input using D(G(z)).
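Wiring this up might look like the following (the placeholder names are assumptions; BATCH_SIZE, Z_DIM and IMAGE_SIZE come from the hyperparameters):

```python
zin = tf.placeholder(tf.float32, [BATCH_SIZE, Z_DIM], name="z")
real_images = tf.placeholder(tf.float32,
                             [BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, 3], name="real_images")

G_z = generator(zin, BATCH_SIZE)                # fake images, G(z)
D_real_logits = discriminator(real_images)      # D(x)
D_fake_logits = discriminator(G_z, reuse=True)  # D(G(z)), reusing D's shared weights
```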

The loss for the Discriminator is implemented using TensorFlow’s built-in functions:
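A sketch using tf.nn.sigmoid_cross_entropy_with_logits: real images get target 1 and generated images get target 0 (the variable names are assumptions):

```python
d_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
    logits=D_real_logits, labels=tf.ones_like(D_real_logits)))
d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
    logits=D_fake_logits, labels=tf.zeros_like(D_fake_logits)))
d_loss = d_loss_real + d_loss_fake
```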

We’ll next need to train the Generator such that D(G(z)) outputs a one, i.e., we’ll fix the weights of the Discriminator and backpropagate only through the Generator weights such that the Discriminator always outputs a one.

The loss function for the Generator will therefore be:
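Reusing the same cross-entropy helper, but with the targets for the fake logits set to one:

```python
g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
    logits=D_fake_logits, labels=tf.ones_like(D_fake_logits)))
```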

We’ll next collect all the weights of the Discriminator and the Generator (these are later needed to train only the Generator or only the Discriminator):
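Since the two networks were built under separate variable scopes, we can split the trainable variables by scope name (a common pattern; the scope strings follow the sketches above):

```python
train_vars = tf.trainable_variables()
d_vars = [v for v in train_vars if v.name.startswith("Discriminator")]
g_vars = [v for v in train_vars if v.name.startswith("Generator")]
```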

We’ve used TensorFlow’s AdamOptimizer to learn the weights. The next step is to pass the variables that each optimizer is allowed to modify to the Discriminator and the Generator optimizers respectively.
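For example (LEARNING_RATE and BETA1 are assumed to live in mission_control.py; the DCGAN paper suggests 0.0002 and 0.5):

```python
# var_list restricts each optimizer to updating one network's weights
d_optimizer = tf.train.AdamOptimizer(LEARNING_RATE, beta1=BETA1).minimize(
    d_loss, var_list=d_vars)
g_optimizer = tf.train.AdamOptimizer(LEARNING_RATE, beta1=BETA1).minimize(
    g_loss, var_list=g_vars)
```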

The last step would be to run the session and pass the required image batches to the optimizers. We’ll train the model for 30000 iterations and periodically display the Discriminator and the Generator losses.
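A minimal sketch of that loop (next_batch is a hypothetical helper that samples BATCH_SIZE images from the dataset; numpy is assumed to be imported as np from the preprocessing step):

```python
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(30000):
        batch = next_batch(dataset, BATCH_SIZE)  # hypothetical batching helper
        z_batch = np.random.uniform(-1.0, 1.0, [BATCH_SIZE, Z_DIM])
        # Update the Discriminator on a real batch and a fake batch
        _, dl = sess.run([d_optimizer, d_loss],
                         feed_dict={real_images: batch, zin: z_batch})
        # Update the Generator (it never sees the real images)
        _, gl = sess.run([g_optimizer, g_loss], feed_dict={zin: z_batch})
        if step % 100 == 0:
            print("step {}: d_loss = {:.4f}, g_loss = {:.4f}".format(step, dl, gl))
```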

In order to make tuning the hyperparameters easier and to save the results of each run, we’ve implemented the form_results function and created a file called mission_control.py.

All the hyperparameters for the network can be modified in the mission_control.py file; running the main.py file will then automatically create folders for each run and save the TensorBoard files and the generated images.

We can have a look at the Discriminator and the Generator loss at each iteration during training by opening up TensorBoard and pointing it to the Tensorboard directory created under each run folder (check out the GitHub link for more details).

Variation of Generator loss during training

Variation of Discriminator loss during training

From these graphs it can be seen that the Discriminator and the Generator losses keep rising and falling during training, indicating that the Generator and the Discriminator are constantly trying to outperform each other.

The code also saves the generated images for each run and some of these images are shown below:

At the 0th iteration:

100th iteration:

1000th iteration:

By the 30000th iteration the Generator has overfit to the training images:

The generated images during the training stage are shown below:

These images are promising, but after about 1000 iterations it can be seen that the Generator is just reproducing images from the training dataset. We could use a larger dataset and train for a smaller number of iterations to reduce this overfitting.

GANs are easy to implement but hard to train without the right hyperparameters and network architecture. We’ve written this article with the main intent of helping people get started with Generative Networks.