"It Must Be Gesturing Towards Me": Gesture-Based Interaction between Autonomous Vehicles and Pedestrians
Xiang Chang, Zihe Chen, Xiaoyan Dong, Yuxin Cai, Tingmin Yan, Haolin Cai, Zherui Zhou, Guyue Zhou, Jiangtao Gong
TL;DR
This work tackles the challenge of enabling pedestrians to understand autonomous vehicle intentions through gesture-based external HMI (eHMI) designs inspired by human driver gestures. The authors combine a controlled VR study (N=31) with a large online survey (N=394) to compare eight gesture-based eHMIs (four yielding, four non-yielding) against state-of-the-art and no-eHMI baselines, evaluating clarity, familiarity, politeness, and safety-related metrics. They find that well-chosen gesture sets improve comprehension, reduce hesitation, and lower perceived danger, though some gestures cause ambiguity or misinterpretation, highlighting the importance of recipient clarity and semantic design. The study provides actionable insights for gesture selection, learning dynamics, and age-related preferences, suggesting that gesture-based eHMIs can enhance pedestrian trust and interaction efficiency with AVs, while also outlining limitations and directions for real-world validation and multimodal integration.
Abstract
Interacting with pedestrians understandably and efficiently is one of the toughest challenges faced by autonomous vehicles (AVs) due to the limitations of current algorithms and external human-machine interfaces (eHMIs). In this paper, we design eHMIs based on gestures, inspired by the most popular method of interaction between pedestrians and human drivers. Eight common gestures were selected from previous literature to convey AVs' yielding or non-yielding intentions at uncontrolled crosswalks. Through a VR experiment (N1 = 31) and a follow-up online survey (N2 = 394), we found significant differences in usability between gesture-based eHMIs and current eHMIs. Good gesture-based eHMIs increase the efficiency of pedestrian-AV interaction while ensuring safety. Poor gestures, however, cause misinterpretation. We explored the underlying reasons: ambiguity about the recipient of the signal, and whether the gestures are precise, polite, and familiar to pedestrians. Based on this empirical evidence, we discuss potential opportunities and provide insights into developing comprehensible gesture-based eHMIs to support better interaction between AVs and other road users.
