By hallucinating the existence of humans, robots can learn how to better cater to people and understand our world, researchers say.
To help robots interact with and react to the world around them as well as possible, scientists want to design machines that know what’s in their surroundings. Attempts to identify an item based on its appearance can run into a gauntlet of problems: for example, an object might look different over time depending on the lighting or the angle the robot views it from, and one item might differ enough from a similar one in color, size or shape to confound a droid’s limited knowledge.
One way researchers want to improve a robot’s ability to identify objects is to help it recognize the context an item lies within. For instance, if a machine recognizes a setting as a kitchen, that could help it figure out that objects on a countertop might be cups, bowls or utensils.
Roboticist Ashutosh Saxena and his colleagues at Cornell University’s Personal Robotics Lab reasoned that people are likely the most important factor for robots to keep in mind in the places where they work, since those environments are typically designed around human use. As such, when people are not actually there for reference, hallucinating their presence could help provide key context for the machines.
"I remembered passages by a few authors from my high-school reading, such as Charles Dickens, whose description of environments was so vivid and connected to people — for example, the description of Miss Havisham’s room in Great Expectations — that it created a picture in my mind," Saxena says.
The scientists experimented with a droid roughly the size of a full-grown adult that can roll around rooms on wheels, grasping items with two claw-tipped arms. The robot is named Kodiak — because Cornell’s mascot is a bear, all their robots are named after members of the bear family.
Kodiak sees the world in three dimensions using a Microsoft Kinect camera, which is armed with an infrared scanner that helps create 3-D models of items. Although the Kinect was originally developed for video gaming, roboticists now use the sensor to help robots navigate rooms.
The researchers had Kodiak imagine adult-sized 3-D stick-figure versions of people that were arranged in three sitting and three standing poses against backdrops of full 3-D scans of offices and homes. The robot does not model every position humans might occupy in a scene, only the most likely ones — for instance, it does not imagine people floating in the air or standing on tables or chairs.
"What we found surprising was how much structure there is in human environments because of humans," Saxena says. "Just by looking at empty scenes, our algorithm learns about how hallucinated humans use the objects in the environment."
Using these hallucinations proved helpful when it came to robots identifying what they saw.
"For robots arranging environments — placing objects in correct places — the average error in the correct location of the object was reduced by more than one foot, which resulted in about 95 percent of placements to be correct as compared to about 70 percent when not hallucinating humans," Saxena says.
For instance, the robots could recognize the configuration of chairs, tables, monitors and keyboards commonly found in offices, which can naturally be explained by a posed human avatar sitting in the chair and working on the computer. Small handheld devices are typically held close to people, while certain objects like televisions are often used at a distance.
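To make the idea concrete, here is a minimal Python sketch of how a candidate spot for an object might be scored against imagined human positions. It is an illustration only, not the Cornell team's algorithm: the function names, the preferred distances and the Gaussian-style scoring rule are all assumptions made for this example.

```python
import math

# Hypothetical preferred human-to-object distances in meters.
# These values are illustrative assumptions, not figures from the research.
PREFERRED_DISTANCE = {"phone": 0.3, "keyboard": 0.5, "tv": 2.5}


def placement_score(obj_type, location, human_positions, spread=0.5):
    """Score a candidate location for an object.

    The score is highest (1.0) when at least one imagined human is
    exactly at the object's preferred distance, and falls off smoothly
    as the distance departs from that preference.
    """
    preferred = PREFERRED_DISTANCE[obj_type]
    best = 0.0
    for human in human_positions:
        distance = math.dist(location, human)  # Euclidean distance in 3-D
        score = math.exp(-((distance - preferred) ** 2) / (2 * spread ** 2))
        best = max(best, score)
    return best


def best_placement(obj_type, candidates, human_positions):
    """Return the candidate location with the highest score."""
    return max(
        candidates,
        key=lambda loc: placement_score(obj_type, loc, human_positions),
    )


# One imagined person at the origin, two candidate spots along the x-axis.
humans = [(0.0, 0.0, 0.0)]
spots = [(0.5, 0.0, 0.0), (2.5, 0.0, 0.0)]
tv_spot = best_placement("tv", spots, humans)
keyboard_spot = best_placement("keyboard", spots, humans)
```

In this toy setup the television lands at the far spot and the keyboard at the near one, mirroring the observation that handheld items sit close to people while televisions are used from a distance.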
"The most rewarding point was that when our algorithm discovered how humans use objects without actually seeing humans using it," Saxena says.
Future applications of such research could include robot butlers and maids that help pick up after people. In addition, "I could speculate that a self-driving car could hallucinate where people could go, so as to safely drive more conservatively in those situations," Saxena says.
The scientists will detail their findings on June 25 at the Robotics: Science and Systems Conference in Berlin and on June 27 at the Computer Vision and Pattern Recognition conference in Portland.