Lidars and cameras are two of the three standard sensors (along with radar) on almost all self-driving cars being tested today. Lidars and cameras both operate by detecting reflected light, but cameras are passive, whereas lidars actively send out laser pulses and measure the light that gets reflected back. Cameras produce a flat two-dimensional image, while conventional lidars produce a three-dimensional "point cloud."

Lidar startup Ouster has developed a clever and potentially significant hack: the company figured out how to make the sensors already on its powerful OS-1 lidar units function as a camera, producing a panoramic two-dimensional snapshot of the sensor's surroundings.

"The OS-1's optical system has a larger aperture than most DSLRs, and the photon counting ASIC we developed has extreme low-light sensitivity," Ouster CEO Angus Pacala writes. "So we're able to collect ambient imagery even in low-light conditions."

Even better, these snapshots actually have three layers. The first layer is an ambient-light image like you would get from a conventional camera. Meanwhile, a second, "signal" layer is based on light reflected from the sensor's laser pulses. Finally, there's a "depth" layer that provides the distance to each pixel in the other two layers.
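To make that structure concrete, here is a minimal sketch of how such a three-layer frame might be represented in code. The array shape, field names, and data types are illustrative assumptions, not Ouster's actual output format.

```python
import numpy as np

# Hypothetical panoramic frame from a spinning lidar: 64 beams tall and
# 1024 columns around. Shapes, names, and dtypes are illustrative
# assumptions, not Ouster's actual data format.
H, W = 64, 1024

frame = {
    "ambient": np.zeros((H, W), dtype=np.uint16),   # passive near-infrared light
    "signal":  np.zeros((H, W), dtype=np.uint16),   # returned laser intensity
    "range":   np.zeros((H, W), dtype=np.float32),  # distance to each pixel, meters
}

# Because all three layers come from the same photodetectors, pixel (i, j)
# in every layer refers to the same point in the scene.
i, j = 32, 500
print(frame["ambient"][i, j], frame["signal"][i, j], frame["range"][i, j])
```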

Still, these images have significant limitations: they're low-resolution, and they're black and white, not color. And while conventional cameras capture light in visible wavelengths, Ouster's lidar captures light in near-infrared wavelengths—though Ouster says the images "closely resemble visible light images of the same scenes."

At first glance, using a $12,000 lidar sensor to capture low-resolution black-and-white snapshots might seem pointless. After all, virtually all self-driving cars being tested today are already festooned with conventional cameras capable of capturing much higher-resolution images—in color. But while these images aren't going to win any photography awards, Pacala argues that they could have significant value.

Why a lidar-camera could be useful

Self-driving cars need to merge data from different sensors into a single unified model of the vehicle's surroundings. This isn't easy. Cameras and lidars operate at different resolutions and frame rates. They're also necessarily mounted in different positions on the car, which means they capture a scene from slightly different angles. Sensors also get jostled and require periodic recalibration.

Some lidar makers have tried bundling a conventional camera together with a lidar unit. But Pacala argues that this still produces subpar results.

"We've seen multiple lidar companies market a lidar/camera fusion solution by co-mounting a separate camera with a lidar, performing a shoddy extrinsic calibration, and putting out a press release for what ultimately is a useless product," Pacala writes. "We didn't do that."

Instead, Ouster uses the existing light sensors on its lidar unit to capture ambient light as well, and it outputs the data in a format that makes correlating the different types of data easy. The three layers captured by Ouster's lidar (ambient light, laser light, and depth) are perfectly aligned, in both space and time, so that the software processing the data always knows the exact distance to every pixel.
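Here is a minimal sketch of what that alignment buys in practice: if a detector finds an object in the ambient layer, the matching range pixels give its distance with no reprojection step. The fake range values, the bounding box, and the median heuristic are all illustrative assumptions, not Ouster's pipeline.

```python
import numpy as np

# Fake range layer standing in for the "range" channel of a 64x1024 frame;
# values are made up for illustration.
rng_layer = np.random.uniform(0.5, 80.0, (64, 1024)).astype(np.float32)

# Suppose a detector running on the ambient layer reports a car at this
# bounding box (row/column indices into the panoramic frame).
top, left, bottom, right = 20, 400, 50, 480

patch = rng_layer[top:bottom, left:right]  # same indices, no reprojection needed
valid = patch[patch > 0]                   # drop pixels with no lidar return
print(f"detected car at ~{np.median(valid):.1f} m")
```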

And Pacala argues that this format makes the data particularly useful for machine-learning algorithms. Machine-learning researchers have decades of experience doing image recognition on flat two-dimensional images using techniques like convolutional neural networks. Formatting lidar data as a two-dimensional pixel map allows customers to directly apply these powerful machine-learning techniques.

"We're able to feed these images directly into deep-learning algorithms that were originally developed for cameras," Pacala writes.

Convolutional neural networks are designed to work with multi-layered pixel maps; after all, conventional images can be thought of as having red, green, and blue layers. So it's straightforward to take a neural network designed for conventional color images and retrain it to recognize images with ambient, laser, and depth layers instead of red, green, and blue.
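As a rough sketch of what that retraining might look like, the snippet below fine-tunes an off-the-shelf RGB network on three-channel lidar frames. This is a generic PyTorch/torchvision pattern, not Ouster's actual training code; the class count, frame size, and fake data are assumptions.

```python
import torch
import torchvision

# Generic fine-tuning sketch, not Ouster's training code: a pretrained RGB
# classifier already expects three input channels, so stacking the ambient,
# signal, and depth layers produces a tensor it can consume directly.
model = torchvision.models.resnet18(weights="DEFAULT")
model.fc = torch.nn.Linear(model.fc.in_features, 4)  # hypothetical 4-class task

# Fake batch: 8 frames, channels = (ambient, signal, depth), 64x1024 pixels.
frames = torch.rand(8, 3, 64, 1024)
labels = torch.randint(0, 4, (8,))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
optimizer.zero_grad()
loss = torch.nn.functional.cross_entropy(model(frames), labels)
loss.backward()
optimizer.step()
```

The key design point is that no architectural surgery is needed: because the lidar frame has exactly three channels, only the final classification layer and the training data change.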

The engineers at Ouster have done just that. They've taken several conventional neural networks designed for ordinary RGB images and applied them to the pixel maps produced by the lidar "camera." "The networks we've trained have generalized extremely well to the new lidar data types," Pacala concludes.

In one case, Ouster fed intensity and depth data from a drive around San Francisco into a pixel-level classifier. The algorithm ran in real time on an Nvidia GTX 1060 graphics card, and it appeared to do a good job of distinguishing other vehicles (shown in red) and drivable roadway (yellow) from background objects.
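That kind of "runs in real time" claim can be sanity-checked with off-the-shelf tools. Here is a minimal sketch of how one might benchmark per-pixel inference throughput; the network, class count, and frame size below are stand-in assumptions, not Ouster's actual classifier.

```python
import time
import torch
import torchvision

# Stand-in benchmark, not Ouster's classifier: measure how many frames per
# second a per-pixel segmentation network can process on the available GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torchvision.models.segmentation.fcn_resnet50(weights=None, num_classes=3)
model.eval().to(device)

frame = torch.rand(1, 3, 64, 1024, device=device)  # ambient/signal/depth channels

with torch.no_grad():
    for _ in range(3):        # warm-up so one-time CUDA initialization isn't timed
        model(frame)
    if device == "cuda":
        torch.cuda.synchronize()
    start, n = time.perf_counter(), 20
    for _ in range(n):
        out = model(frame)["out"]  # per-pixel scores for 3 hypothetical classes
    if device == "cuda":
        torch.cuda.synchronize()
    print(f"{n / (time.perf_counter() - start):.1f} frames per second")
```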


This feature isn't going to fully replace conventional cameras, of course. There's no substitute for high-resolution color images. But building safe self-driving cars is all about maximizing redundancy. Cars use cameras and lidars and radars because these three sensor types have complementary strengths.

The new camera-like feature of Ouster lidars gives the sensors capabilities that neither cameras nor conventional lidars had before. Having perfectly aligned depth information for every pixel in an image seems to provide information that can't be gleaned from flat images or point clouds alone.