More efficient machine vision technology modeled on human vision

Prof. Robert Dick and advisee Ekdeep Singh Lubana developed a new technique that significantly improves the efficiency of machine vision applications

(left image) The conventional uniform-resolution sampling and processing approach used in most machine vision applications. This is appropriate in image reproduction applications, where aesthetics are important, but not in energy-constrained machine vision applications. (right image) By sampling irrelevant background information at low resolution and regions of interest at high resolution, energy consumption is dramatically reduced while preserving accuracy.


Prof. Robert Dick and Ekdeep Singh Lubana at the University of Michigan have developed a more efficient technique for machine vision by modeling it on human vision. By simply manipulating a camera’s firmware, the technique cuts energy consumption by 80% with almost no impact on accuracy in practical vision applications such as license plate recognition and facial recognition. Called “Digital Foveation,” it is also faster than conventional methods.

“It’ll make new things and things that were infeasible before, practical,” Dick said. “Instead of having to change a battery once a week, for example, it’ll work for five weeks.”

Whether helping inspect and sort products or determining whether the object in front of a driverless car is a pedestrian or a paper bag, machine vision plays a critical role in our increasingly automated lives. It has had a tremendous impact on security, healthcare, banking, transportation, and industry, and its applications are expected to have a market value of $15.46 billion by 2022.


However, today’s machine vision is limited. Because machine vision algorithms are demanding in both power and computation, even the simple task of recognizing license plates can tax a computer. Currently, when computers visually analyze a scene, they use a camera to capture a uniform, high-resolution image and then transfer the data to an application’s processor. The processor runs image classification algorithms, and transferring and processing all of that data costs a great deal of energy and time.

Humans, however, gather and process visual information far more efficiently. The human retina has only one small area that supports high-resolution vision: a central, dense sensing region called the “fovea.” By looking around, humans see different parts of their surroundings at different resolutions, capturing the most important areas at high resolution and less important details at low resolution. This information is then pieced together to draw inferences about the scene.
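As a rough illustration of the multi-resolution idea described above, the Python sketch below keeps a small region of interest at full resolution and subsamples the rest of the frame, then compares how many pixels would need to be transferred and processed. It is not the researchers' implementation, which works in camera firmware. The function names, the synthetic 64x64 frame, the hand-picked region of interest, and the subsampling factor of 4 are all assumptions made for this example, and pixel count is only a loose proxy for energy use.

```python
def subsample(image, factor):
    """Keep every `factor`-th pixel in each dimension (a coarse, low-resolution view)."""
    return [row[::factor] for row in image[::factor]]


def crop(image, top, left, height, width):
    """Return the full-resolution pixels inside a rectangular region."""
    return [row[left:left + width] for row in image[top:top + height]]


def foveated_capture(image, roi, factor):
    """Hypothetical helper: low-resolution background plus a full-resolution region of interest."""
    top, left, height, width = roi
    background = subsample(image, factor)
    fovea = crop(image, top, left, height, width)
    return background, fovea


def pixel_count(grid):
    """Total number of pixels in a 2D list-of-lists image."""
    return sum(len(row) for row in grid)


# Synthetic 64x64 grayscale frame (assumed data, for illustration only).
frame = [[(x + y) % 256 for x in range(64)] for y in range(64)]

# Assumed region of interest: (top, left, height, width), chosen by hand here.
roi = (24, 16, 16, 32)

background, fovea = foveated_capture(frame, roi, factor=4)

uniform_pixels = pixel_count(frame)
foveated_pixels = pixel_count(background) + pixel_count(fovea)
fraction = foveated_pixels / uniform_pixels

print(f"uniform capture:  {uniform_pixels} pixels")
print(f"foveated capture: {foveated_pixels} pixels ({fraction:.2%} of uniform)")
```

In this toy setup the foveated capture carries well under a quarter of the pixels of the uniform one. The exact ratio depends entirely on the assumed region size and subsampling factor, so it should not be read as a reproduction of the energy savings reported in the article.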

Ekdeep Singh Lubana evaluating the energy consumption, performance, and accuracy implications of Digital Foveation.
