Reactive Multi-Robot Navigation in Outdoor Environments Through Uncertainty-Aware Active Learning of Human Preference Landscape
Chao Huang, Wenshuo Zang, Carlo Pinciroli, Zhi Jane Li, Taposh Banerjee, Lili Su, Rui Liu
TL;DR
The paper tackles robust, real-time navigation for multi-robot systems in outdoor, unstructured environments where human preferences are noisy and context-dependent. It introduces PLBA, a joint preference landscape learning and behavior-adjusting framework that combines multi-output Gaussian Processes with varying output noise and an active-learning loop to efficiently infer human preferences $f(\cdot)=[f^{(1)}(\cdot),\ldots,f^{(d)}(\cdot)]^T$ and their uncertainty, yielding MAP estimates $\hat{f}(\cdot)$. The behavior adjustment component uses an optimization-based planner with a composite velocity model $v_i = v_i^{flock}+v_i^{fmt}+v_i^{rep}+v_i^{att}+v_i^{saf}+v_i^{hei}+v_i^{ali}$ under a speed cap $h_{speed}$ and safety constraints, guided by active queries when predictive covariance is high. Evaluation on a flood-disaster scenario with 20 human participants producing 1764 feedback samples demonstrates faster, more accurate preference learning and safer, more effective MRS adaptation than baseline regression methods. The work advances autonomous, uncertainty-aware human-robot teaming in unstructured outdoor spaces and suggests directions for onboard sensing and broader multi-robot collaboration.
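As a rough illustration of the composite velocity model and the uncertainty-triggered query described above, the following Python sketch sums a robot's velocity terms and then enforces the speed cap. It rests on assumptions not stated in the summary: the cap is applied by rescaling the summed vector, and a query is requested when the largest predictive variance exceeds a fixed threshold. The function names `composite_velocity` and `should_query` and the `threshold` parameter are hypothetical.

```python
import numpy as np

def composite_velocity(components, h_speed):
    """Sum the velocity terms of one robot and cap the resulting speed.

    components: iterable of same-length vectors (e.g. flock, rep, att terms).
    h_speed: maximum allowed speed; rescaling is an assumed way to enforce it.
    """
    v = np.sum(np.asarray(components, dtype=float), axis=0)
    speed = np.linalg.norm(v)
    if speed > h_speed:
        v = v * (h_speed / speed)  # keep the direction, shrink the magnitude
    return v

def should_query(pred_cov, threshold):
    """Return True when the largest predictive variance exceeds the threshold.

    pred_cov: square predictive covariance matrix; threshold is hypothetical.
    """
    return float(np.max(np.diag(pred_cov))) > threshold
```

For example, two terms `[3, 0]` and `[0, 4]` sum to a vector of speed 5; with `h_speed = 2.5` the sketch returns the same direction at half the magnitude.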
Abstract
Compared with single robots, Multi-Robot Systems (MRS) can perform missions more efficiently because their members offer diverse capabilities. However, deploying an MRS in large real-world environments remains challenging because obstacles are varied and uncertain (e.g., building clusters and trees). With a limited understanding of how environmental uncertainty affects performance, an MRS cannot flexibly adjust its behaviors (e.g., teaming, load sharing, trajectory planning) to ensure both environment adaptation and task accomplishment. In this work, a novel joint preference landscape learning and behavior-adjusting framework (PLBA) is designed. PLBA efficiently integrates real-time human guidance into MRS coordination and uses Sparse Variational Gaussian Processes with Varying Output Noise to quickly assess human preferences by leveraging spatial correlations between environment characteristics. An optimization-based behavior-adjusting method then safely adapts MRS behaviors to environments. To validate PLBA's effectiveness in MRS behavior adaptation, a flood-disaster search and rescue task was designed. Twenty human users provided 1764 feedback samples reflecting their preferences about MRS behaviors with respect to "task quality", "task progress", and "robot safety". The prediction accuracy and adaptation speed results show the effectiveness of PLBA in preference learning and MRS behavior adaptation.
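To make the idea of preference learning with varying output noise more concrete, here is a minimal sketch that is not the paper's method: it uses an exact, single-output Gaussian Process in NumPy instead of the sparse variational model, gives each feedback sample its own noise variance, and picks the next location to query as the candidate with the largest posterior variance. The RBF kernel, the length scale, and the names `rbf`, `gp_posterior`, and `next_query` are all assumptions made for illustration.

```python
import numpy as np

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length**2)

def gp_posterior(x_train, y_train, noise_var, x_test, length=1.0):
    """Exact GP posterior mean and variance with per-sample noise variances."""
    k_tt = rbf(x_train, x_train, length) + np.diag(noise_var)
    k_ts = rbf(x_train, x_test, length)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)
    mean = k_ts.T @ alpha
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance is 1 for this kernel
    return mean, var

def next_query(x_train, y_train, noise_var, candidates):
    """Index of the candidate whose preference estimate is most uncertain."""
    _, var = gp_posterior(x_train, y_train, noise_var, candidates)
    return int(np.argmax(var))
```

In this sketch a noisier feedback sample reduces uncertainty less than a reliable one, so locations near noisy or missing feedback are queried first. A multi-output or sparse variational model would replace `gp_posterior` while keeping the same selection rule.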
