Creating Android Lens Blur Depth Viewer

Article | Posted on May 6, 2014

Going from an update to the Android camera app to a 3D viewer in the browser, in a few hours, using JavaScript. The latest update to the Android camera app embeds depth data in the images it produces. With a bit of reverse engineering and some tinkering with a hex editor, a text editor, some JavaScript and a browser, it's easy to get a 3D model representing the original image.

A bit of reverse engineering

A few weeks ago (April 2014) Google updated the Android camera app with a revamped UI and a few nice additions, among which is Lens Blur, a new camera mode for taking pictures with depth. It uses optical flow and other computer vision algorithms to estimate depth from a picture and video taken with a slight upwards movement. The resulting image can then be used to simulate a shallow depth-of-field effect: you click the area of the picture you want in focus, and specify the blur radius for the out-of-focus parts.

@mrdoob was asking on Twitter where that extra depth data was stored, because he couldn't find it in the filesystem. I trust his computer skills, so if he can't find a file he's looking for, it's because it's not there. So where is the depth data stored? The logical answer is "together with the image file, somewhere inside it". So let's all set sail for treasure!

First things first: get a picture taken with the new app to test with. I actually had to wait a while to get the update, but there were already several pictures on the web. They are, and look like, regular JPEG files.

A few years ago I used to work a lot with MJPEG streams and weird video formats based on JPEG sequences with boundaries, so I'm pretty familiar with the block structure of JPEG. I used HexFiend to get an idea of the layout of the contents of the file, and started working on a loader, extracting pieces and decoding base64 content that looked promising.

Finding the depth map

Using XMLHttpRequest to load the image as a byte array, we can split it at the segment boundaries and work with each chunk depending on its content. Some chunks are binary data, but others are simply XML-formatted strings. There are a couple of interesting blocks of metadata: one contains info about the picture and how it was taken, another about the actual depth metrics. The latter also contains a base64-encoded string with the depth map.
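The loading step can be sketched like this. The `GDepth:Data` tag name comes from Google's XMP depth-map documentation, and `extractPayload` is a hypothetical helper, not part of any library; treat the whole thing as an assumption-laden sketch rather than the actual loader.

```javascript
// Hypothetical helper: returns the substring between two markers, or null.
function extractPayload(text, startMarker, endMarker) {
  const start = text.indexOf(startMarker);
  if (start === -1) return null;
  const from = start + startMarker.length;
  const end = text.indexOf(endMarker, from);
  if (end === -1) return null;
  return text.substring(from, end);
}

// In the browser, the JPEG would be loaded as raw bytes and scanned:
// const xhr = new XMLHttpRequest();
// xhr.open('GET', 'lens-blur-photo.jpg');
// xhr.responseType = 'arraybuffer';
// xhr.onload = () => {
//   const bytes = new Uint8Array(xhr.response);
//   let text = '';
//   for (let i = 0; i < bytes.length; i++) text += String.fromCharCode(bytes[i]);
//   const depthBase64 = extractPayload(text, 'GDepth:Data="', '"');
// };
```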

The depth map is a grayscale PNG with the same dimensions as the original image. It can be easily extracted and used as the source of an image element to display in the browser.
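Since the decoded payload is already a complete PNG, displaying it is just a matter of wrapping the base64 string in a data URI; a minimal sketch:

```javascript
// The base64 string extracted from the metadata is a full PNG file,
// so the browser can decode it directly from a data URI.
function depthMapToDataURI(base64Png) {
  return 'data:image/png;base64,' + base64Png;
}

// In the browser:
// const img = new Image();
// img.src = depthMapToDataURI(depthBase64);
// document.body.appendChild(img);
```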

That's where we can start creating a nice little library to extract depth images from Lens Blur pictures.

From image to mesh

The next logical step is to use the color image and the depth image to reconstruct the 3D scene the picture was taken of. Doing that seems pretty straightforward: we can create a point cloud, each point being a pixel from the color image. We go over all pixels in the color image, and for each pixel we create a vertex where:

- x is the horizontal position of the pixel on the image (0 to image width),
- y is the vertical position of the pixel on the image (0 to image height),
- z is the value of the corresponding pixel in the depth map (0 to 255).

As said, the depth map is a grayscale image, so each pixel has 256 possible values: depth is encoded as one of those discrete values and has to be remapped to the right range.
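The per-pixel loop above can be sketched as follows. `depth` is assumed here to be a flat array with one byte per pixel (e.g. the red channel of the grayscale PNG, read back through a canvas with `getImageData`); the function name is mine, not the article's.

```javascript
// Build one vertex per pixel: x/y from the pixel position,
// z from the raw depth byte (0-255, not yet remapped).
function buildPointCloud(width, height, depth) {
  const vertices = [];
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      vertices.push({ x: x, y: y, z: depth[y * width + x] });
    }
  }
  return vertices;
}
```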

At this point we have the first version of the 3D viewer, which looks boxy and really not quite right.

Correct mesh reconstruction

First, the depth doesn't look correct because it's not stored as a linear value. There are two possible ways XMP stores depth data: RangeLinear and RangeInverse, two different encodings of the distance. The header -and the Google documentation- states that it is RangeInverse, so the value, once read from the byte, has to be transformed using ( far * near ) / ( far - value * ( far - near ) ). The far and near values are available in the GDepth header.
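A minimal sketch of that transformation, assuming the depth byte is first normalized to the 0–1 range before the formula is applied (which is how it makes the endpoints land exactly on near and far):

```javascript
// RangeInverse decoding: byteValue is the raw 0-255 depth sample,
// near and far come from the GDepth header.
function rangeInverseDepth(byteValue, near, far) {
  const value = byteValue / 255; // normalize to 0..1
  return (far * near) / (far - value * (far - near));
}
```

Sanity check: a byte of 0 maps to near, a byte of 255 maps to far, and everything in between is spread non-linearly, with more precision close to the camera.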