All pretrained loss networks used in these experiments were downloaded from the TensorFlow-Slim models repository. The inception-v3 network trained on OpenImages was obtained from this script.

The code used in these experiments is available on GitHub.

To find out which layers I mean by Conv2d_2c_3x3, Mixed_3b, etc. for inception-v1, run

in the repo. Similarly for inception-v2, inception-v3, inception-v4, vgg-16 and vgg-19.

Tweak #1: Removing checkerboard artifacts

Checkerboard artifacts can occur in images generated by neural networks. They are typically caused by transposed 2D convolutions whose kernel size is not divisible by the stride. For a more in-depth discussion of checkerboard artifacts, read this post.

Backpropagation through a convolution is a transposed convolution. Thus, when training an image using a loss network, checkerboard artifacts can occur if the loss network has a convolution layer whose kernel size is not divisible by its stride.
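To see why the backward pass of a convolution is a transposed convolution, here is a minimal 1D numpy sketch (my own illustration, not code from the repo): the gradient with respect to the input scatters a kernel-weighted copy of each output gradient back onto the input, and when the kernel size is not divisible by the stride, these copies overlap unevenly.

```python
import numpy as np

def conv1d(x, w, stride):
    # "valid" strided 1D convolution (cross-correlation)
    k = len(w)
    n_out = (len(x) - k) // stride + 1
    return np.array([np.dot(x[i*stride : i*stride + k], w) for i in range(n_out)])

def conv1d_input_grad(grad_out, w, stride, in_len):
    # Gradient of conv1d w.r.t. its input: each output gradient scatters
    # a kernel-weighted copy back onto the input -- this scatter is
    # exactly a transposed (fractionally strided) convolution.
    k = len(w)
    grad_in = np.zeros(in_len)
    for i, g in enumerate(grad_out):
        grad_in[i*stride : i*stride + k] += g * w
    return grad_in

# Kernel size 3, stride 2: 3 is not divisible by 2, so the scattered
# copies overlap unevenly and the gradient alternates in magnitude.
g = conv1d_input_grad(np.ones(4), np.ones(3), stride=2, in_len=9)
print(g)  # [1. 1. 2. 1. 2. 1. 2. 1. 1.]
```

The alternating 1/2 pattern in the gradient is the 1D analogue of a checkerboard: pixels receive systematically different gradient magnitudes purely because of where they sit relative to the stride.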

In the inception-v1, v2, v3 and v4 architectures, the first layer has stride 2 and kernel size 7 (in v1 and v2) or 3 (in v3 and v4). Neither 7 nor 3 is divisible by 2, so there was a possibility that checkerboard artifacts would be created here.
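The divisibility rule can be checked directly by counting, for each input position, how many kernel windows touch it. A quick sketch (again my own, in 1D for simplicity) for the kernel sizes used in the inception first layers, plus a divisible kernel size for contrast:

```python
import numpy as np

def contribution_counts(in_len, k, stride):
    # How many windows of a size-k, stride-`stride` convolution touch
    # each input position; this is also the overlap pattern of the
    # transposed convolution in the backward pass.
    counts = np.zeros(in_len, dtype=int)
    n_out = (in_len - k) // stride + 1
    for i in range(n_out):
        counts[i*stride : i*stride + k] += 1
    return counts

for k in (7, 3, 4):  # 7: inception-v1/v2 first layer; 3: v3/v4; 4: divisible by 2
    interior = contribution_counts(64, k, stride=2)[k:-k]  # ignore border effects
    print(f"kernel {k}, stride 2:",
          "uniform" if len(set(interior.tolist())) == 1 else "uneven")
```

Kernel sizes 7 and 3 give an uneven pattern with stride 2, while kernel size 4 covers every interior position the same number of times.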

To check whether this was the case, I trained a noise image on only the content loss of Conv2d_1a_7x7 (from the inception-v1 architecture). This image was generated. It looks normal, but on zooming in, checkerboard artifacts become visible.
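The mechanics of that experiment can be sketched in a toy 1D setting (this is a stand-in, not the actual inception-v1 setup): optimize a noise signal so that its features under a single stride-2, kernel-7 convolution match a target's features, using plain gradient descent on the content loss.

```python
import numpy as np

# Toy 1D stand-in for the experiment: one stride-2, kernel-7 "layer"
# plays the role of Conv2d_1a_7x7; the signal plays the role of the image.
rng = np.random.default_rng(0)
k, stride, n = 7, 2, 65
w = rng.standard_normal(k)
w /= np.linalg.norm(w)                     # keep gradient steps stable
target = rng.standard_normal(n)

def features(x):
    n_out = (len(x) - k) // stride + 1
    return np.array([np.dot(x[i*stride : i*stride + k], w) for i in range(n_out)])

t_feat = features(target)
x = 0.1 * rng.standard_normal(n)           # the "noise image" being trained
for _ in range(500):
    diff = features(x) - t_feat            # d(content loss)/d(features), up to a constant
    grad = np.zeros(n)                     # backprop = transposed convolution,
    for i, g in enumerate(diff):           # the source of checkerboard artifacts
        grad[i*stride : i*stride + k] += g * w
    x -= 0.2 * grad

print(np.mean((features(x) - t_feat) ** 2))  # content loss, far below initialization
```

Because every gradient step passes through the transposed convolution, any kernel/stride mismatch imprints its uneven overlap pattern on the trained signal at each update, which is why the artifacts only become visible on close inspection of the final image.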