Researchers at the Korea Advanced Institute of Science and Technology and Pohang University of Science and Technology have introduced a machine learning system, InstaGAN, which can perform instance-aware image-to-image translation tasks — such as replacing sheep in photos with giraffes — across multiple image datasets. The paper InstaGAN: Instance-Aware Image-to-Image Translation has been accepted to the respected International Conference on Learning Representations (ICLR) 2019, which will take place this May in New Orleans, USA.

An image-to-image translation system learns to map an input image onto an output image. Unsupervised image-to-image translation has garnered considerable research attention recently, in part due to the rapid development of the generative adversarial networks (GANs) that now power the technique. Previous methods, however, struggled with challenging tasks — for example when an image contains multiple target instances, or when the translation requires significant changes in shape. Last month Google AI researchers introduced a state-of-the-art model capable of realistically inserting an input object into a photo with both a reasonable position choice and an accurate prediction of the output object's size, pose, and shape. The team behind InstaGAN, however, believed even this advanced method could be improved upon.

The new research builds on CycleGAN, a GAN variant that learns to translate images without paired training data, overcoming the one-to-one pairing requirement of pix2pix. CycleGAN can automatically translate between two unordered image sets X and Y, but it cannot encode instance information within an image, so its results fall short when a translation depends on specific features of individual targets. InstaGAN overcomes this limitation by incorporating instance information from multiple target instances into the translation.
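The mechanism underlying CycleGAN is a pair of mappings, G: X→Y and F: Y→X, trained so that translating an image to the other domain and back reproduces the original. The toy sketch below illustrates only that cycle-consistency term, with simple linear functions standing in for the learned generators (the functions and sample values are illustrative assumptions, not the paper's networks):

```python
# Toy sketch of CycleGAN's cycle-consistency loss.
# G maps domain X -> Y and F maps Y -> X; training encourages
# F(G(x)) ~ x and G(F(y)) ~ y so the unpaired mappings stay
# mutually consistent. The linear "generators" below are stand-ins
# for the real neural networks.

def G(x):
    """Hypothetical generator X -> Y."""
    return 2.0 * x + 1.0

def F(y):
    """Hypothetical generator Y -> X (here an exact inverse of G)."""
    return (y - 1.0) / 2.0

def cycle_consistency_loss(xs, ys):
    """Mean L1 cycle loss: |F(G(x)) - x| + |G(F(y)) - y|."""
    forward = sum(abs(F(G(x)) - x) for x in xs) / len(xs)
    backward = sum(abs(G(F(y)) - y) for y in ys) / len(ys)
    return forward + backward

xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]
print(cycle_consistency_loss(xs, ys))  # exact inverses give 0.0
```

In the full model this term is added to the adversarial losses of the two domain discriminators; only the cycle term is shown here.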

InstaGAN focuses on the boundaries of target instances while ignoring details such as colour. Compared with CycleGAN on image-to-image translation, InstaGAN is more successful at generating a plausible shape for each targeted instance while preserving its original context.
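A key architectural idea in InstaGAN is to translate an image jointly with the set of its instance masks, aggregating per-mask features in a permutation-invariant way (for example by summation) so that the result does not depend on how many instances there are or in what order they arrive. The sketch below illustrates that aggregation on toy data; the per-mask "encoder" is a hypothetical stand-in for the paper's learned feature extractors:

```python
# Toy sketch of permutation-invariant aggregation over instance masks:
# any number of masks, in any order, yields the same joint
# representation. Real encoders are neural networks; here each "mask"
# is a flat list of 0/1 pixels and the "feature" is a scalar.

def mask_feature(mask):
    """Hypothetical per-instance encoder for one binary mask."""
    return sum(mask)  # stand-in for a learned feature vector

def aggregate(masks):
    """Sum-pool per-mask features: invariant to mask order and count."""
    return sum(mask_feature(m) for m in masks)

m1 = [1, 1, 0, 0]   # instance 1 occupies two pixels
m2 = [0, 0, 1, 0]   # instance 2 occupies one pixel

# Reordering the instances does not change the aggregated value.
assert aggregate([m1, m2]) == aggregate([m2, m1])
print(aggregate([m1, m2]))  # -> 3
```

This set-based design is what lets the model handle scenes with an arbitrary number of target instances, rather than assuming a single object per image.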

The research has been accepted as an ICLR 2019 Poster Paper. Reviewers noted that the method is novel and clearly explained, that it solves problems previous methods could not, and that the results look significantly better than those of CycleGAN and other baselines; they also called for additional controlled experiments with the model.

The paper InstaGAN: Instance-Aware Image-to-Image Translation is on arXiv, and more information is available on the InstaGAN GitHub.