"Why the face?": Exploring Robot Error Detection Using Instrumented Bystander Reactions

Maria Teresa Parreira; Ruidong Zhang; Sukruth Gowdru Lingaraju; Alexandra Bremers; Xuanyu Fang; Adolfo Ramirez-Aristizabal; Manaswi Saha; Michael Kuniavsky; Cheng Zhang; Wendy Ju

TL;DR

The study addresses how robots can detect and adapt to human reactions to robot errors, using a neck-mounted device (NeckFace) that records facial expressions from the chin region. It introduces NeckNet-18, which maps IR-camera data to 3D facial points and head motion, and builds error-detection models on the reconstructed signals. These models outperform baselines that rely on OpenFace or video data, and they generalize especially well in within-participant settings. The findings support expanding human-in-the-loop sensing in HRI and suggest that 3D reaction data can support personalized error detection, with potential for real-time robotic adaptation. Overall, the work advances social cue detection in robotics and motivates broader use of wearable, mobile sensing for context-aware human–robot collaboration.

Abstract

How do humans recognize and rectify social missteps? We achieve social competence by looking around at our peers, decoding subtle cues from bystanders (a raised eyebrow, a laugh) to evaluate the environment and our actions. Robots, however, struggle to perceive and make use of these nuanced reactions. By employing a novel neck-mounted device that records facial expressions from the chin region, we explore the potential of previously untapped data to capture and interpret human responses to robot error. First, we develop NeckNet-18, a 3D facial reconstruction model that maps the reactions captured through the chin camera onto facial points and head motion. We then use these facial responses to develop a robot error detection model that outperforms standard methodologies such as using OpenFace or video data, and that generalizes especially well for within-participant data. Through this work, we argue for expanding human-in-the-loop robot sensing, fostering more seamless integration of robots into diverse human environments, pushing the boundaries of social cue detection, and opening new avenues for adaptable robotics.