Indicating Robot Vision Capabilities with Augmented Reality

Hong Wang, Ridhima Phatak, James Ocampo, Zhao Han

TL;DR

This work tackles the misalignment between human mental models and robot vision by introducing four augmented reality (AR) field-of-view (FoV) indicators that range from egocentric to allocentric designs. In a Bayesian-guided, 1×5 human-subject experiment with Pepper and HoloLens 2, allocentric indicators in the task space yielded the highest accuracy, and the Eye Socket design also improved correctness. Across all indicators, confidence remained high and workload stayed low. The authors provide six practical guidelines for applying AR indicators or physical alterations to align human expectations with robots' actual FoV, with implications for transparency and coordination in human-robot collaboration.

Abstract

Research indicates that humans can mistakenly assume that robots and humans have the same field of view (FoV), and thus hold an inaccurate mental model of robots. This misperception may lead to failures during human-robot collaboration tasks, where robots might be asked to complete impossible tasks involving out-of-view objects. The issue is more severe when robots, focused on their assigned tasks, do not have a chance to scan the scene and update their world model. To help align humans' mental models with robots' vision capabilities, we propose four FoV indicators in augmented reality (AR) and evaluate them in a human-subjects experiment (N=41) in terms of accuracy, confidence, task efficiency, and workload. These indicators span a spectrum from egocentric (robot's eye and head space) to allocentric (task space). Results showed that the allocentric blocks in the task space had the highest accuracy, albeit with a delay in interpreting the robot's FoV. The egocentric indicator of deeper eye sockets, which is feasible as a physical alteration, also increased accuracy. Across all indicators, participants' confidence was high while cognitive load remained low. Finally, we contribute six guidelines for practitioners to apply our AR indicators or physical alterations to align humans' mental models with robots' vision capabilities.

Paper Structure

This paper contains 33 sections, 21 figures, 2 tables.

Figures (21)

  • Figure 1: To indicate a robot's vision capability, i.e., the field of view (FoV), we propose four egocentric and allocentric indicators in augmented reality (AR) and evaluate them in a user study against a Baseline (no indicators). The design philosophy--from the eyes/head to task space--and descriptions of each design are detailed in Sections \ref{sec:taxonomy} and \ref{sec:designs}.
  • Figure 2: Our design spectrum: from the robot (eyes to head) towards the task environment, and vice versa.
  • Figure 3: The toolkit used in our collaborative task. (Product photo toolkit used under Fair Use.)
  • Figure 4: Experiment setup. Left Task Table: Two pre-assembled parts for participants to start building the airplane model. Participants sat approximately 3.3 meters away so they could see the indicators in full. Right Robot Table: Objects within the robot's reach that are needed to finish assembly.
  • Figure 5: Four assembly steps to build the airplane model.
  • ...and 16 more figures