The development of home robots could be accelerated by a realistic virtual environment in which future robots could practice and perfect their skills.
Researchers at MIT, Nvidia, and ETH Zurich developed Watch-And-Help (WAH), a challenge in which embodied AI agents must infer a goal by watching a human demonstrate a task and then coordinate with the human to complete the task as quickly as possible.
In the first phase of WAH, which the researchers call the Watch stage, an AI agent observes a humanlike agent perform a task and infers the goal from its actions.
In the second stage – the Help stage – the AI agent assists the humanlike agent in achieving the same goal in a completely different environment.
The researchers assert that this two-stage framework poses unique challenges for human-AI collaboration because the AI agent has to reason about the humanlike agent’s intention and generalize its knowledge about the goal.
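As a rough illustration of the two stages, the Python sketch below uses a toy counting rule as a stand-in for goal inference; it is not the researchers' model. The demonstration format, the `infer_goal` and `remaining_subgoals` helpers, and the object names are all hypothetical and chosen only for this example.

```python
from collections import Counter

def infer_goal(demonstration):
    """Watch stage (toy): count each (object, destination) the demonstrator places."""
    goal = Counter()
    for action, obj, dest in demonstration:
        if action == "put":
            goal[(obj, dest)] += 1
    return goal

def remaining_subgoals(goal, environment_state):
    """Help stage (toy): remove whatever the new environment already satisfies."""
    satisfied = Counter(environment_state)
    # Counter subtraction keeps only subgoals with a positive remaining count.
    return goal - satisfied

# Hypothetical demonstration watched in the first environment.
demo = [
    ("walk", "kitchen", None),
    ("grab", "plate", None),
    ("put", "plate", "table"),
    ("grab", "plate", None),
    ("put", "plate", "table"),
    ("put", "fork", "table"),
]
goal = infer_goal(demo)

# A different environment in which one plate is already on the table.
new_env = [("plate", "table")]
todo = remaining_subgoals(goal, new_env)
print(todo)
```

The point of the sketch is that the inferred goal, not the demonstrated action sequence, is what carries over to the new environment: the helper still has to work out which parts of the goal remain unmet there.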
To enable the kinds of interactions involved in WAH, the researchers extended the open-source platform VirtualHome into a multi-agent environment dubbed VirtualHome-Social.
VirtualHome-Social simulates home settings in which agents can interact with objects and with one another, for example by opening a container or grabbing a utensil from a drawer.
VirtualHome-Social also provides built-in agents that emulate human behaviors and an interface for human players.
The humanlike agent is a built-in agent in VirtualHome-Social. It plans its actions based on a goal and its observation of the environment. During the Help stage, the AI agent receives observations from the system at each step and sends an action command back to control a virtual avatar.
The humanlike agent – which can also be controlled by a human – updates its plan based on its latest observation to reflect any state change caused by the AI agent.
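The interaction loop described above can be sketched in a few lines of Python. This is a simplified simulation under stated assumptions, not the VirtualHome-Social API: `plan_next` and `run_help_stage` are hypothetical names, each subgoal is assumed to take a single step, and the helper is assumed to know which subgoal the humanlike agent is about to pursue.

```python
def plan_next(goal, state, skip=None):
    """Return the first unmet subgoal, ignoring one that another agent has claimed."""
    for subgoal in goal:
        if subgoal not in state and subgoal != skip:
            return subgoal
    return None

def run_help_stage(goal, initial_state, with_helper=True, max_steps=20):
    """Simulate the Help stage and return the number of steps taken and the final state."""
    state = set(initial_state)
    steps = 0
    while steps < max_steps and not all(g in state for g in goal):
        # The humanlike agent replans from its latest observation, so any
        # change made by the helper in the previous step is taken into account.
        human_action = plan_next(goal, state)
        # The AI agent receives the same observation and sends back an action.
        # Assumption: it avoids duplicating the subgoal the human is pursuing.
        helper_action = plan_next(goal, state, skip=human_action) if with_helper else None
        for action in (human_action, helper_action):
            if action is not None:
                state.add(action)  # toy dynamics: an action completes its subgoal
        steps += 1
    return steps, state

# Hypothetical goal made of four distinct subgoals.
goal = ["plate_on_table", "fork_on_table", "glass_on_table", "napkin_on_table"]
steps_alone, _ = run_help_stage(goal, [], with_helper=False)
steps_helped, final_state = run_help_stage(goal, [], with_helper=True)
print(steps_alone, steps_helped)
```

In this toy setting the task takes four steps alone and two with the helper, which mirrors the article's framing that the agents should finish the task as quickly as possible.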
The researchers designed an evaluation protocol and provided benchmarks for WAH, including a goal inference model for the Watch stage and multiple planning and machine learning baselines for the Help stage.
The team says results indicate that to achieve success in WAH, AI agents must acquire strong social perception and generalizable helping strategies – as hypothesized.
“Our ultimate goal is to build AI agents that can work with real humans. Our platform opens up exciting directions of future work, such as online goal inference and direct communication between agents,” the researchers wrote.
“We hope that the proposed challenge and virtual environment can promote future research on building more sophisticated machine social intelligence.”