A dynamic image has a “wow” factor: it conveys spatial relationships and can express much more than a still photo. Creating such a moving photo, however, can involve taking multiple pictures from different points of view followed by a great deal of manual editing work.

A new Adobe-developed AI tool significantly lowers the threshold for producing dynamic images with a framework that synthesizes a “3D Ken Burns effect” from a single image. It supports both a fully automatic mode and an interactive mode with the user controlling the camera. Researchers introduce the technique in the paper 3D Ken Burns Effect from a Single Image.

The Ken Burns effect is a technique for animating still images with a virtual pan and zoom. Adding parallax results in a compelling 3D Ken Burns effect. Compared to traditional methods, Adobe’s technique reduces the number of input images from many to one and saves significant editing effort. Even an untrained photographer can produce the same effect as imaging professionals.
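The basic 2D effect amounts to interpolating a crop window between a start framing and an end framing. A minimal sketch of that idea (function and parameter names here are illustrative, not from the paper):

```python
def ken_burns_crops(start, end, num_frames):
    """Linearly interpolate a crop rectangle (x, y, w, h) from a start
    framing to an end framing, one rectangle per output frame.
    Rendering each crop rescaled to the output size yields the classic
    2D pan-and-zoom effect; the 3D variant adds per-pixel parallax."""
    crops = []
    for i in range(num_frames):
        t = i / (num_frames - 1) if num_frames > 1 else 0.0
        crops.append(tuple(s + t * (e - s) for s, e in zip(start, end)))
    return crops

# Pan right and zoom in over 5 frames on a 640x480 image.
frames = ken_burns_crops((0, 0, 640, 480), (160, 120, 320, 240), 5)
```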

At the core of this research is a semantic-aware depth estimation pipeline that involves three steps:

1. Estimating coarse depth with a VGG-19 convolutional neural network that extracts semantic information from a low-resolution image.
2. Adjusting the depth map according to instance-level segmentation from a Mask R-CNN neural network to ensure consistent depth values inside salient objects.
3. Refining the depth boundaries, guided by the input image, while upsampling the low-resolution depth estimate.
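The second step enforces one coherent depth inside each segmented object. A toy illustration of that idea (not the paper's actual implementation), which flattens the depth values under an instance mask to their median:

```python
import statistics

def flatten_instance_depth(depth, mask):
    """Given a depth map and a binary instance mask (both as lists of
    rows), replace depth values inside the mask with their median so
    the whole object gets one consistent depth."""
    rows, cols = len(depth), len(depth[0])
    inside = [depth[r][c] for r in range(rows)
              for c in range(cols) if mask[r][c]]
    med = statistics.median(inside)
    return [[med if mask[r][c] else depth[r][c]
             for c in range(cols)] for r in range(rows)]

# A 2x3 depth map with a noisy object occupying the left two columns.
depth = [[1.0, 3.0, 9.0],
         [2.0, 1.5, 9.0]]
mask  = [[1, 1, 0],
         [1, 1, 0]]
flat = flatten_instance_depth(depth, mask)  # object pixels become 1.75
```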

Overview of depth estimation pipeline

Compared with other methods for producing the 3D Ken Burns effect, Adobe’s framework shows more consistent and natural object boundary inpainting — generating consistent content to fill the “holes” revealed in a dynamic image by the perspective shift.
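Those holes arise because nearer pixels shift more than farther ones as the virtual camera moves. A one-dimensional sketch of the geometry (illustrative only, not Adobe's inpainting method) showing how a depth-dependent shift exposes unfilled positions:

```python
def disoccluded_positions(disparity, width):
    """Shift each source pixel right by its disparity (larger for
    nearer pixels) and return the target positions that no source
    pixel lands on -- the "holes" that inpainting must then fill."""
    covered = {x + d for x, d in enumerate(disparity) if 0 <= x + d < width}
    return sorted(set(range(width)) - covered)

# A near object (disparity 2) in front of a far background (disparity 0):
# the object slides right, exposing background that was never photographed.
disparity = [0, 0, 2, 2, 0, 0]
holes = disoccluded_positions(disparity, 6)  # positions 2 and 3 are exposed
```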

Comparing video synthesis results between DeepFill, EdgeConnect and the Adobe approach.

The Adobe research group conducted an informal user study to evaluate the technique. Participants were asked to create the 3D Ken Burns effect from provided images using the new Adobe tool and two other methods. The consensus was that Adobe’s tool generated higher quality images and had superior usability.

The first author of the research paper is Simon Niklaus, a PhD student at Portland State University whose research focuses on computer vision and deep learning. He developed this project as an Adobe research intern and is now a research intern with Google. In a post on Hacker News, Niklaus said he plans to open-source the project but has not yet received permission from Adobe.

The paper 3D Ken Burns Effect from a Single Image is on arXiv.