Reactive Multi-Robot Navigation in Outdoor Environments Through Uncertainty-Aware Active Learning of Human Preference Landscape

Chao Huang; Wenshuo Zang; Carlo Pinciroli; Zhi Jane Li; Taposh Banerjee; Lili Su; Rui Liu

TL;DR

The paper tackles robust, real-time navigation for multi-robot systems in outdoor, unstructured environments where human preferences are noisy and context-dependent. It introduces PLBA, a joint preference landscape learning and behavior-adjusting framework that combines multi-output Gaussian Processes with varying output noise and an active-learning loop to efficiently infer human preferences $f(\cdot)=[f^{(1)}(\cdot),\dots,f^{(d)}(\cdot)]^T$ and their uncertainty, yielding MAP estimates $\hat{f}(\cdot)$. The behavior adjustment component uses an optimization-based planner with a composite velocity model $v_i = v_i^{flock}+v_i^{fmt}+v_i^{rep}+v_i^{att}+v_i^{saf}+v_i^{hei}+v_i^{ali}$ under a speed cap $h_{speed}$ and safety constraints, guided by active queries when predictive covariance is high. Evaluation on a flood-disaster scenario with 20 human participants producing 1764 feedback samples demonstrates faster, more accurate preference learning and safer, more effective MRS adaptation than baseline regression methods. The work advances autonomous, uncertainty-aware human-robot teaming in unstructured outdoor spaces and suggests directions for onboard sensing and broader multi-robot collaboration.
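
As a rough illustration of the composite velocity model, the sketch below sums per-robot velocity terms and limits the resulting speed to $h_{speed}$. It is only a sketch under stated assumptions: the function name `compose_velocity`, the 2-D example vectors, and the choice to enforce the cap by rescaling the summed vector are all hypothetical, since the summary does not say how the planner applies the cap.

```python
import numpy as np

def compose_velocity(terms, h_speed):
    """Sum the velocity terms of one robot and cap the resulting speed.

    terms   : iterable of equal-length vectors (e.g. flock, fmt, rep, ...)
    h_speed : maximum allowed speed (norm of the commanded velocity)

    Assumption: the cap is enforced by rescaling the summed vector,
    which keeps its direction. The paper may enforce it differently.
    """
    v = np.sum(np.asarray(terms, dtype=float), axis=0)
    speed = np.linalg.norm(v)
    if speed > h_speed:
        v = v * (h_speed / speed)
    return v

# Hypothetical example with two terms in 2-D: the sum is (3, 4) with speed 5.
example_terms = [[3.0, 0.0], [0.0, 4.0]]
v_capped = compose_velocity(example_terms, h_speed=2.5)   # rescaled to speed 2.5
v_free = compose_velocity(example_terms, h_speed=10.0)    # below the cap, unchanged
```

Rescaling preserves the heading produced by the weighted terms, which is one common way to respect a speed limit without changing the direction of travel.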

Abstract

Compared with single robots, Multi-Robot Systems (MRS) can perform missions more efficiently due to the presence of multiple members with diverse capabilities. However, deploying an MRS in wide real-world environments is still challenging due to uncertain and varied obstacles (e.g., building clusters and trees). With a limited understanding of how environmental uncertainty affects performance, an MRS cannot flexibly adjust its behaviors (e.g., teaming, load sharing, trajectory planning) to ensure both environment adaptation and task accomplishment. In this work, a novel joint preference landscape learning and behavior adjusting framework (PLBA) is designed. PLBA efficiently integrates real-time human guidance into MRS coordination and utilizes Sparse Variational Gaussian Processes with Varying Output Noise to quickly assess human preferences by leveraging spatial correlations between environment characteristics. An optimization-based behavior-adjusting method then safely adapts MRS behaviors to environments. To validate PLBA's effectiveness in MRS behavior adaptation, a flood disaster search and rescue task was designed. Twenty human users provided 1764 feedback samples on MRS behaviors, reflecting their preferences regarding "task quality", "task progress", and "robot safety". The prediction accuracy and adaptation speed results show the effectiveness of PLBA in preference learning and MRS behavior adaptation.
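
To make the idea of uncertainty-triggered human queries more concrete, the sketch below fits a small Gaussian Process with a separate noise variance per feedback sample and flags test environments whose predictive variance exceeds a threshold. It is a simplified stand-in, not the paper's model: it uses an exact single-output GP with an RBF kernel instead of a sparse variational multi-output GP, and the names `rbf_kernel`, `gp_posterior`, `needs_query`, the data, and the threshold value are all hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return var * np.exp(-0.5 * d2 / length ** 2)

def gp_posterior(X, y, noise_var, Xs):
    """Exact GP posterior mean and variance at Xs.

    noise_var holds one noise variance per training sample, a simple
    way to model feedback of varying reliability (assumption).
    """
    K = rbf_kernel(X, X) + np.diag(noise_var)
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks.T @ alpha
    V = np.linalg.solve(L, Ks)
    var = np.diag(rbf_kernel(Xs, Xs)) - np.sum(V ** 2, axis=0)
    return mean, var

def needs_query(var, threshold):
    """Flag test points whose predictive variance is above the threshold."""
    return var > threshold

# Hypothetical 1-D environment feature with two feedback samples;
# the second sample is treated as much noisier than the first.
X = np.array([[0.0], [1.0]])
y = np.array([1.0, 0.5])
noise_var = np.array([0.01, 0.5])
Xs = np.array([[0.0], [5.0]])   # one familiar and one unfamiliar environment

mean, var = gp_posterior(X, y, noise_var, Xs)
flags = needs_query(var, threshold=0.5)
```

In this toy setup the familiar environment has a small predictive variance and is not flagged, while the distant one stays close to the prior variance and would trigger a request for human feedback.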

Paper Structure (10 sections, 11 equations, 7 figures, 1 algorithm)

Introduction
Related Work
Methods
  Problem statement
  Multi-output GPs with varying output noise
  Preference-based safe MRS behavior adjustment
Evaluation
  Experiment setting
  Result analysis
Conclusion and Future Work

Figures (7)

Figure 1: Illustration of a simulated flood disaster site. When deployed in a flood disaster response scenario, an MRS adapts to various environments, such as cluttered, structured, and open space environments, whose characteristics cannot be fully determined before actual deployment.
Figure 2: PLBA workflow. The left part is the active learning framework based on preference learning. The right part is the unstructured environment adaptation process during MRS deployments. (a) preference prediction for environments, (b) human preference-based MRS behavior adaptation planning, (c) MRS behavior adjustment and preference model update.
Figure 3: Illustration of experiment setting, including a flood disaster site, human preference (the width of trajectory is proportional to the ratio of three aspects of human preference), and multi-robot flocking behaviors.
Figure 4: Preferred robot behavior illustration for three typical environments "Cluttered", "Regular" and "Clearance".
Figure 5: Human preference values for eight typical environments over five kinds of robot flocking behaviors ("CA": coverage area/m, "FH": flying height/m, "SD": safe distance to the obstacle/m, "FS": flying speed/(m/s), and "TF": team formation). Green represents the mean values of human preference, and red denotes the variance of human preferences.
...and 2 more figures
