Indicating Robot Vision Capabilities with Augmented Reality
Hong Wang, Ridhima Phatak, James Ocampo, Zhao Han
TL;DR
This work tackles the mismatch between humans' mental models of robot vision and robots' actual field of view (FoV) by introducing four augmented reality (AR) FoV indicators that range from egocentric to allocentric designs. In a Bayesian-guided, 1×5 human-subjects experiment with a Pepper robot and HoloLens 2, allocentric indicators in the task space yielded the highest accuracy, and the Eye Socket design also improved accuracy; across all indicators, confidence remained high and workload stayed low. The authors provide six practical guidelines for applying AR indicators or physical alterations to align human expectations with robots' actual FoV, with implications for transparency and coordination in human-robot collaboration. The findings inform AR-based transparency design and point toward more task-relevant human-robot communication in real-world settings.
Abstract
Research indicates that humans can mistakenly assume that robots and humans have the same field of view (FoV), and thus hold an inaccurate mental model of robots. This misperception may lead to failures during human-robot collaboration tasks, in which robots might be asked to complete impossible tasks involving out-of-view objects. The issue is more severe when robots, focused on their assigned tasks, do not have a chance to scan the scene to update their world model. To help align humans' mental models with robots' vision capabilities, we propose four FoV indicators in augmented reality (AR) and report a human-subjects experiment (N=41) that evaluated them in terms of accuracy, confidence, task efficiency, and workload. These indicators span a spectrum from egocentric (the robot's eye and head space) to allocentric (the task space). Results showed that the allocentric blocks in the task space had the highest accuracy, though with a delay in interpreting the robot's FoV. The egocentric indicator of deeper eye sockets, which could also be realized as a physical alteration, increased accuracy as well. Across all indicators, participants' confidence was high while cognitive load remained low. Finally, we contribute six guidelines for practitioners to apply our AR indicators or physical alterations to align humans' mental models with robots' vision capabilities.
