Tobias Fischer, and Yiannis Demiris
Imperial College London
Personal Robotics Lab
Perspective taking enables humans to imagine the world from another viewpoint. This allows reasoning about the state of other agents, which in turn is used to more accurately predict their behavior. In this paper, we equip an iCub humanoid robot with the ability to perform visuospatial perspective taking (PT) using a single depth camera mounted above the robot. Our approach has the distinct benefit that the robot can be used in unconstrained environments, as opposed to previous works which employ marker-based motion capture systems. Prior to and during the PT, the iCub learns the environment, recognizes objects within the environment, and estimates the gaze of surrounding humans. We propose a new head pose estimation algorithm which shows a performance boost by normalizing the depth data to be aligned with the human head. Inspired by psychological studies, we employ two separate mechanisms for the two different types of PT. We implement line of sight tracing to determine whether an object is visible to the humans (level 1 PT). For more complex PT tasks (level 2 PT), the acquired point cloud is mentally rotated, which allows algorithms to reason as if the input data was acquired from an egocentric perspective. We show that this can be used to better judge where object are in relation to the humans. The multifaceted improvements to the PT pipeline advance the state of the art, and move PT in robots to markerless, unconstrained environments.
If you want to use the source code, you need to clone the WYSIWYD repository.