Video: Robot recognition

Picture the scene, a few years from now. “Robot, fetch me that pillow over there,” you say to your ever-willing butlerbot. “Certainly, sir,” it replies. “What’s a pillow?”

Hema Koppula and Abhishek Anand at Cornell University in Ithaca, New York, hope to avoid this disappointing scenario by teaching robots to understand the context of their surroundings so that they can pick out individual objects in a room. “We have developed an algorithm that learns to identify the objects in home and office scenes,” explains Koppula.

Key to the system is Microsoft’s Kinect sensor, which perceives real-world 3D scenes by combining a visible-light camera with depth information from an infrared sensor. Koppula and Anand’s algorithm learns to recognise particular objects by studying images labelled with descriptive tags such as “wall”, “floor” and “tabletop”. The researchers used 27 labels in total: 10 each for office and home scenes, and seven that applied to both.

Previous approaches to the problem have used expensive depth sensors that couldn’t provide colour information, but the cheap Kinect captures both depth and colour, allowing the algorithm to consider shape and colour together when evaluating an object. The system also takes relative locations into account – for example, computer monitors are normally found on top of a table, rather than underneath.
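To see why relative location can help, consider a toy sketch in Python. It is not the researchers' algorithm, only an illustration of the idea: every segment name, score and "on top of" bonus below is invented for the example. Each segment gets a score per label from its shape and colour alone, and a bonus or penalty is added when one label plausibly sits on top of another.

```python
from itertools import product

LABELS = ["monitor", "tabletop", "floor"]

# Hypothetical scores from shape and colour alone (higher = better match).
appearance = {
    "seg_a": {"monitor": 0.4, "tabletop": 0.5, "floor": 0.1},
    "seg_b": {"monitor": 0.3, "tabletop": 0.4, "floor": 0.3},
}

# Hypothetical bonus when the first label sits on top of the second.
on_top_of = {
    ("monitor", "tabletop"): 0.5,
    ("tabletop", "floor"): 0.3,
    ("tabletop", "monitor"): -0.5,
}

def best_labelling(upper, lower):
    """Return the (upper_label, lower_label) pair with the highest total score."""
    def score(pair):
        upper_label, lower_label = pair
        return (appearance[upper][upper_label]
                + appearance[lower][lower_label]
                + on_top_of.get(pair, 0.0))
    # Try every combination of labels for the two segments and keep the best.
    return max(product(LABELS, repeat=2), key=score)

# On appearance alone, seg_a looks slightly more like a tabletop...
print(max(appearance["seg_a"], key=appearance["seg_a"].get))
# ...but knowing it sits on top of seg_b tips the balance.
print(best_labelling("seg_a", "seg_b"))
```

In this made-up case the upper segment looks marginally more like a tabletop on its own, but once its position above the lower segment is taken into account, "monitor on tabletop" becomes the best joint explanation.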


It turns out office locations are easier to classify than home scenes, with Koppula and Anand’s algorithm achieving 84 per cent recognition success in the former versus 74 per cent in the latter. Koppula puts this down to the lack of variety in office environments compared with our more personalised homes. “Offices typically have a very ordered structure, whereas every home is very different,” she says.

Getting warm

To find out how the algorithm performed in a real-world setting, the researchers mounted a Kinect on a mobile robot and asked it to find a keyboard. As you can see in the video above, the robot begins by examining its surroundings. It spots a computer monitor and then moves in for a closer look, knowing that keyboards are often found nearby – an intuition that proves correct, as it finds and points at the keyboard.
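The robot's search strategy can be caricatured in a few lines of Python. This is a simplified sketch under stated assumptions rather than the code running on the robot: the `NEAR_PRIOR` table, its numbers and the `next_place_to_look` helper are all hypothetical. The idea is simply to head for whichever object already spotted is most likely to have the target nearby.

```python
# Hypothetical priors: how likely the target is to be found near each landmark.
NEAR_PRIOR = {
    "keyboard": {"monitor": 0.8, "tabletop": 0.5, "wall": 0.05},
}

def next_place_to_look(target, detections):
    """Pick the detected landmark most likely to have the target nearby.

    detections: list of (label, (x, y)) pairs the robot has already spotted.
    Returns the chosen (label, position), or None if nothing useful was seen.
    """
    priors = NEAR_PRIOR.get(target, {})
    # Ignore detections that tell us nothing about where the target might be.
    candidates = [d for d in detections if d[0] in priors]
    if not candidates:
        return None
    return max(candidates, key=lambda d: priors[d[0]])

seen = [("wall", (0.0, 3.0)), ("monitor", (2.0, 1.5))]
print(next_place_to_look("keyboard", seen))
```

With a wall and a monitor in view, the sketch sends the robot towards the monitor, mirroring the behaviour described above.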

The work demonstrates that robots can find objects, but how about teaching machines what a keyboard is actually for? “The next aim is to also include humans in the learning process, [with a robot] observing humans and being able to learn attributes of objects,” says Koppula. A robot could learn that if it sees a human sitting on an object, for instance, there is a good chance the object is a chair.

Daniel Huber, who researches computer vision at Carnegie Mellon University in Pittsburgh, Pennsylvania, says the work is a “nice advance” over previous efforts, partly thanks to the Kinect. “The fact that these low-cost sensors are available is going to make a big difference in revolutionising the way people do computer vision.”