Keeping up with the GANs

Guide to the key GAN papers for busy nerds.


Intro

Fret not if you feel bombarded by the sheer number of papers on Generative Adversarial Networks (GANs). The truth is, reading a small fraction of them can already give you a good understanding of the subject area (unless you want to go deeper). Here, I have selected the key GAN papers based on their number of citations on Google Scholar.

I have also compiled a list of well-written articles on GANs, which you will see at the end of this post.

How to make this article work for you

All papers link to nurture.ai. Click on the heart at the top right corner to save the paper for later reading.

Quick overview of GANs

GANs were first introduced by Ian Goodfellow in 2014 and have since generated considerable excitement in the research community. Chris Olah described GANs perfectly in his Twitter post:

[Snapshot of Chris Olah's Twitter post]
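To make the adversarial game concrete, here is a minimal sketch of the two competing losses from the original 2014 paper, using numpy and some made-up toy discriminator outputs (the array values are hypothetical, just to illustrate the directions each player pushes):

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    # The discriminator D maximises log D(x) + log(1 - D(G(z))).
    # Equivalently, it minimises the negative of that quantity.
    return -(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

def generator_loss(d_fake):
    # Non-saturating generator loss from the original paper:
    # maximise log D(G(z)) instead of minimising log(1 - D(G(z))),
    # which gives stronger gradients early in training.
    return -np.mean(np.log(d_fake))

# Toy discriminator outputs: probabilities that a sample is real.
d_real = np.array([0.9, 0.8, 0.95])   # D is confident on real samples
d_fake = np.array([0.1, 0.2, 0.05])   # D correctly rejects fakes

print(discriminator_loss(d_real, d_fake))  # small: D is winning
print(generator_loss(d_fake))              # large: G is losing
```

Training alternates between the two: update D to drive its loss down, then update G to drive its loss down, until (ideally) D can no longer tell real from fake.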

GAN reading roadmap

Start your paper-reading journey with the first-ever paper on GANs:

Generative Adversarial Networks (2014) [3021 citations]

Another good way to get started is with overview papers, which summarise the research frontiers and developments of GANs. Here are two:

NIPS 2016 Tutorial: Generative Adversarial Networks (2016) [194 citations]

What is better than reading an explanation of GANs by the creator of GANs himself? I particularly like the Simon Sinek "Start with Why" approach here: the paper opens with a discussion of why generative modeling is a topic worth studying.

Generative Adversarial Networks: An Overview (2017) [13 citations]

Explains GAN architectures, training techniques and unsolved challenges. The paper was originally intended for the signal processing community, but is easily understood by a general audience. A recommended read for those who want a comprehensive understanding of GANs.

GANs create

GANs are generative models: they try to create whatever you feed them. This means GANs must have a good understanding of their inputs (or, more specifically, the distribution of input values). A quote by Richard Feynman resonates well with this:

What I cannot create, I do not understand.

Such understanding is a form of unsupervised learning, which is highly regarded by top researchers because it more closely resembles how we learn. The use of GANs in unsupervised learning was first demonstrated with the DCGAN, a convolutional architecture whose discriminator doubles as a feature extractor, introduced in this paper:

Unsupervised representation learning with deep convolutional generative adversarial networks (2015) [1459 citations]

Other papers that use GANs in unsupervised learning problems:

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (2016) [374 citations]

InfoGAN learns disentangled representations in a completely unsupervised manner. For example, it separates writing styles from number shapes on the MNIST dataset; it also identifies hairstyles, presence of eyewear and emotions from celebrity faces.

Stacked Generative Adversarial Networks (2017) [207 citations]

(Extracted from Introduction) In this paper, we propose a generative model named Stacked Generative Adversarial Networks (SGAN). Our model consists of a top-down stack of GANs, each trained to generate “plausible” lower-level representations conditioned on higher-level representations. Similar to the image discriminator in the original GAN model, which is trained to distinguish “fake” images from “real” ones, we introduce a set of representation discriminators that are trained to distinguish “fake” representations from “real” representations.

StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks [79 citations]

This paper also adopts a stacked approach in generating images. The improvement here is that the architecture used is simpler and produces 256×256 images with photo-realistic details (compared to 32×32 in the previous paper).

Generative Adversarial Text to Image Synthesis (2016) [352 citations]

Creates images of birds and flowers from text descriptions. This is achieved by training a deep convolutional GAN conditioned on a learned text feature representation. However, the generated images are of low resolution; the paper above aims to address this via a stacked framework.

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (2016) [515 citations]

Enhances the resolution of an image by combining an adversarial loss with a content loss.
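The combined objective can be sketched in a few lines. This is a simplified numpy illustration, not the paper's implementation: the real content loss is computed on VGG feature maps rather than raw pixels, and the function and variable names here (`perceptual_loss`, `sr`, `hr`) are my own:

```python
import numpy as np

def perceptual_loss(sr, hr, d_sr, adv_weight=1e-3):
    # Content loss: pixel-wise MSE between the super-resolved image
    # (sr) and the high-resolution ground truth (hr). The paper uses
    # MSE over VGG features; plain pixel MSE is used here for brevity.
    content = np.mean((sr - hr) ** 2)
    # Adversarial loss: push the generator to produce images the
    # discriminator scores as real (d_sr = D(sr), a probability).
    adversarial = -np.mean(np.log(d_sr))
    # The adversarial term gets a small weight relative to content.
    return content + adv_weight * adversarial
```

The adversarial term is what pushes the output away from the blurry, over-smoothed images that a pure MSE objective tends to produce.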

Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (2017) [399 citations]

Introduces a technique for image-to-image translation G: X → Y by coupling it with an inverse mapping F: Y → X and a cycle-consistency loss.
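The cycle-consistency idea is simple to express. In the paper, G and F are neural networks trained on unpaired image collections; in this sketch they are stand-in toy functions (my own, purely for illustration) so the loss is easy to verify by hand:

```python
import numpy as np

# Hypothetical stand-ins for the two learned mappings. In the paper,
# G: X -> Y and F: Y -> X are neural networks; here F exactly inverts G.
G = lambda x: 2.0 * x + 1.0      # forward translation, e.g. horse -> zebra
F = lambda y: (y - 1.0) / 2.0    # inverse translation, zebra -> horse

def cycle_consistency_loss(x, y):
    # L1 penalty enforcing F(G(x)) ~ x and G(F(y)) ~ y. Without it,
    # an unpaired translator could map every input to one arbitrary
    # plausible output and still fool the discriminators.
    forward = np.mean(np.abs(F(G(x)) - x))
    backward = np.mean(np.abs(G(F(y)) - y))
    return forward + backward

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])
print(cycle_consistency_loss(x, y))  # 0.0 here, since F exactly inverts G
```

During training this term is added to the two adversarial losses, tying the forward and backward translators together.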

Coupled Generative Adversarial Networks (2016) [152 citations]

Introduces CoGAN, a model that learns a joint distribution of images from different domains. A friendly read of the paper can be found here.

Synthesizing the preferred inputs for neurons in neural networks via deep generator networks (2016) [95 citations]

Produces high-quality images at high resolution using Deep Generator Network-based Activation Maximization (DGN-AM).

Plug & play generative networks: Conditional iterative generation of images in latent space (2016) [70 citations]

Introduces a technique that overcomes a weakness of DGN-AM: its generated samples lack diversity. This is done by adding a prior on the latent code.

Generative Multi-Adversarial Networks (2016) [15 citations]

What happens when a GAN has multiple discriminators?

GAN Training techniques

Every network has its flaws, and GANs are no exception. Two of the biggest headaches in training GANs are producing images that are both high quality and diverse. Other problems include vanishing gradients, failure to converge, and a generator that produces very similar samples (mode collapse).

Wasserstein Generative Adversarial Networks (2017) [553 citations]

Introduces WGANs, a GAN variant that improves training by using a smooth metric, the Wasserstein distance, to quantify the difference between two probability distributions. The paper also discusses what it means to "learn" a probability distribution. This blog post is good supporting material that explains the mathematics behind GANs and WGANs.
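The core change is small: the discriminator becomes a "critic" that outputs unbounded scores rather than probabilities, and the log losses are replaced by plain differences of means. A minimal numpy sketch, with function names of my own choosing (the original paper also clips weights to enforce the Lipschitz constraint, shown here too):

```python
import numpy as np

def critic_loss(c_real, c_fake):
    # The critic maximises E[C(real)] - E[C(fake)], an estimate of the
    # Wasserstein distance; minimising the negative is equivalent.
    return -(np.mean(c_real) - np.mean(c_fake))

def wgan_generator_loss(c_fake):
    # The generator tries to raise the critic's score on its samples.
    return -np.mean(c_fake)

def clip_weights(w, c=0.01):
    # Original WGAN enforces the required Lipschitz constraint crudely,
    # by clipping every critic weight into [-c, c] after each update.
    return np.clip(w, -c, c)

# Toy critic scores (note: unbounded, not probabilities).
print(critic_loss(np.array([2.0]), np.array([-1.0])))  # -3.0
print(wgan_generator_loss(np.array([-1.0])))           # 1.0
```

Because this loss correlates with sample quality and does not saturate, it gives more meaningful training curves than the original minimax loss.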

Improved techniques for training GANs (2016) [708 citations]

Aims to tackle the problem of non-convergence during GAN training (i.e. the cost functions of the generator and discriminator cannot be simultaneously minimised). The proposed techniques lead to improved performance in semi-supervised learning and sample generation.

Towards Principled Methods for Training Generative Adversarial Networks (2017) [213 citations]

This paper analyses the issues that arise during GAN training, including generator updates that get worse as the discriminator gets better and unstable training, and examines the use of alternative cost functions.

Least Squares Generative Adversarial Networks (2017) [89 citations]

Proposes a least-squares loss function for the discriminator, which leads to generated samples that are closer to the real data.
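The least-squares objective simply replaces the log loss with squared distances to target labels. A short numpy sketch (function names are my own; a, b and c are the fake, real and generator-target labels from the paper, commonly set to 0, 1 and 1):

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake, a=0.0, b=1.0):
    # Push discriminator outputs on real data toward the "real"
    # label b, and outputs on generated data toward the "fake" label a.
    return 0.5 * np.mean((d_real - b) ** 2) + 0.5 * np.mean((d_fake - a) ** 2)

def lsgan_g_loss(d_fake, c=1.0):
    # The generator wants its samples scored as real (label c).
    # Unlike the log loss, this still penalises (and so gives gradients
    # for) samples that fool D but lie far from the decision boundary.
    return 0.5 * np.mean((d_fake - c) ** 2)

print(lsgan_d_loss(np.array([1.0]), np.array([0.0])))  # 0.0: perfect D
print(lsgan_g_loss(np.array([1.0])))                   # 0.0: perfect G
```

Pulling fooling-but-distant samples toward the real data manifold is what improves sample quality, and the bounded quadratic also helps against vanishing gradients.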

Unrolled Generative Adversarial Networks (2017) [53 citations]

Aims to stabilize Generative Adversarial Networks (GANs) by introducing a surrogate objective during updates of the generator.

Generative Adversarial Networks (GANs): What it can generate and What it cannot? (2018) [too new for citations]

A comprehensive overview of the challenges in training GANs and methods to overcome them.

Conditional GANs

A variant of the original GAN that adds additional information to inputs of the generator and discriminator.

Conditional Generative Adversarial Nets (2014) [536 citations]

Introduces conditional GANs, where both the generator and discriminator are conditioned on some extra information y, which could be class labels, data from other modalities, or other forms.
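The conditioning mechanism itself is very simple: the extra information y is concatenated to the generator's noise input (and to the discriminator's input). A minimal numpy sketch, with helper names of my own (`one_hot`, `condition`) and MNIST-style class labels as the assumed condition:

```python
import numpy as np

def one_hot(label, num_classes=10):
    # Encode a class label as a one-hot vector, a common choice for y.
    v = np.zeros(num_classes)
    v[label] = 1.0
    return v

def condition(z, y):
    # Both G and D receive y simply concatenated to their usual input;
    # the networks then learn to generate (or judge) samples of class y.
    return np.concatenate([z, y])

z = np.random.randn(100)   # noise vector for the generator
y = one_hot(3)             # ask the generator for a "3"
g_input = condition(z, y)
print(g_input.shape)       # (110,): 100 noise dims + 10 label dims
```

At sampling time, changing y while holding z fixed lets you pick which class the generator draws.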

Image-to-Image Translation with Conditional Adversarial Networks (2017) [663 citations]

Using conditional GANs on image-to-image translation problems.

Latest papers on GANs

2018 has seen new and interesting approaches to GANs. Here are our picks:

Are GANs Created Equal? A Large-Scale Study (2018) [14 citations, 1 tweet]

This paper was recommended by Ian Goodfellow in a Twitter thread. It proposes quantitative methods for comparing GAN models.

An Introduction to Image Synthesis with Generative Adversarial Nets (2018) [too new for citations]

(Taken from abstract) In this paper, we provide a taxonomy of methods used in image synthesis, review different models for text-to-image synthesis and image-to-image translation, and discuss some evaluation metrics as well as possible future research directions in image synthesis with GAN.

Evolutionary Generative Adversarial Networks (2018) [1 tweet]

Aims to improve GAN training stability and quality of generated images by evolving a population of generators to adapt to the discriminator.

Synthesizing Audio with Generative Adversarial Networks (2018) [1 citation]

Introduces WaveGAN, a first attempt at applying GANs to raw audio synthesis in an unsupervised setting.

Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks (2018) [too new for citations]

GANs can simulate possible futures from time-series data. Leveraging this, the authors of this paper used a GAN to predict socially acceptable human motion trajectories. Their approach could potentially be used in self-driving cars and social robots.

Spectral Normalization for Generative Adversarial Networks (2018) [16 citations]

Aims to improve GAN training stability via a weight normalisation technique called spectral normalisation.
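The technique rescales each discriminator weight matrix by its largest singular value, estimated cheaply with power iteration, so every layer is roughly 1-Lipschitz. A numpy sketch (my own simplified version; the paper runs a single persistent power-iteration step per training update rather than many steps from scratch):

```python
import numpy as np

def spectral_normalize(W, n_iter=50):
    # Estimate the largest singular value (spectral norm) of W with
    # power iteration, then divide W by it so its spectral norm is ~1.
    u = np.random.randn(W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v          # converged estimate of the top singular value
    return W / sigma

W = np.random.default_rng(0).standard_normal((4, 3))
W_sn = spectral_normalize(W)
print(np.linalg.norm(W_sn, 2))  # approximately 1.0
```

Constraining the discriminator's Lipschitz constant this way serves the same purpose as WGAN's weight clipping, but without distorting the weights themselves.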

Articles with good explanations on GANs

GANs in layman's terms (Analytics Vidhya article)

Overview of generative models (OpenAI blog post)

Fantastic GANs and Where to Find Them: the evolution of GANs (article)

Am I missing anything else? Feel free to comment.

Rowen is a research fellow at Nurture.ai. She believes the barrier to understanding powerful knowledge is convoluted language and excessive jargon. Her aim is to break difficult concepts down into easily digestible pieces.