Adaptive Task Allocation in Multi-Human Multi-Robot Teams under Team Heterogeneity and Dynamic Information Uncertainty

Ziqin Yuan; Ruiqi Wang; Taehyeon Kim; Dezhong Zhao; Ike Obi; Byung-Cheol Min

TL;DR

The paper tackles task allocation in multi-human multi-robot (MH-MR) teams under inherent heterogeneity, dynamic operational states, and information uncertainty. It introduces ATA-HRL, a two-level hierarchical reinforcement learning framework that combines Initial Task Assignment (ITA) with Conditional Task Reallocation (CTR), augmented by an auxiliary state representation learning module that reconstructs uncertain observations. The approach is formalized as a hierarchical Markov decision process with tailored reward structures for the ITA, the CTR trigger, and the reallocation actions, and is trained using a combination of policy learning and auxiliary losses (cVAE KL and GRU-based latency supervision). A large-scale environmental monitoring case study shows that ATA-HRL outperforms state-of-the-art baselines in both performance and robustness, particularly as uncertainty increases, which suggests strong practical potential for real-world MH-MR deployments.

Abstract

Task allocation in multi-human multi-robot (MH-MR) teams presents significant challenges due to the inherent heterogeneity of team members, the dynamics of task execution, and the information uncertainty of operational states. Existing approaches often fail to address these challenges simultaneously, resulting in suboptimal performance. To tackle this, we propose ATA-HRL, an adaptive task allocation framework using hierarchical reinforcement learning (HRL), which incorporates initial task allocation (ITA) that leverages team heterogeneity and conditional task reallocation in response to dynamic operational states. Additionally, we introduce an auxiliary state representation learning task to manage information uncertainty and enhance task execution. Through an extensive case study in large-scale environmental monitoring tasks, we demonstrate the benefits of our approach.
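The two-level structure described above (an initial allocation at time step 0, followed by an optional reallocation at each later step) can be illustrated with a minimal control-loop sketch. This is not the authors' implementation: every name here (`run_episode`, `round_robin`, `fatigue_trigger`, `shift_to_robot`, the `fatigue` field) is hypothetical, the policies are hand-written placeholders rather than learned HRL policies, and the state reconstruction module is omitted entirely.

```python
from typing import Callable, Dict, List

Allocation = Dict[str, List[int]]  # agent name -> list of task ids


def run_episode(
    initial_policy: Callable[[List[str], List[int]], Allocation],
    should_reallocate: Callable[[dict], bool],
    reallocation_policy: Callable[[dict, Allocation], Allocation],
    observe: Callable[[int], dict],
    agents: List[str],
    tasks: List[int],
    horizon: int,
) -> List[Allocation]:
    """Return the allocation in force at each time step 0..horizon."""
    allocation = initial_policy(agents, tasks)  # level 1: time step 0
    history = [allocation]
    for t in range(1, horizon + 1):  # level 2: time steps 1..horizon
        state = observe(t)
        if should_reallocate(state):  # the optional "switch"
            allocation = reallocation_policy(state, allocation)
        history.append(allocation)
    return history


# Placeholder policies (assumptions for illustration only).
def round_robin(agents: List[str], tasks: List[int]) -> Allocation:
    alloc: Allocation = {a: [] for a in agents}
    for i, task in enumerate(tasks):
        alloc[agents[i % len(agents)]].append(task)
    return alloc


def fatigue_trigger(state: dict) -> bool:
    # Hypothetical dynamic state: reallocate once fatigue is high.
    return state["fatigue"] > 0.8


def shift_to_robot(state: dict, alloc: Allocation) -> Allocation:
    # Move every task held by the human to the robot; copy, do not mutate.
    new = {a: list(ts) for a, ts in alloc.items()}
    new["robot"].extend(new["human"])
    new["human"] = []
    return new


history = run_episode(
    round_robin,
    fatigue_trigger,
    shift_to_robot,
    observe=lambda t: {"fatigue": 0.3 * t},  # toy state that grows with time
    agents=["human", "robot"],
    tasks=[0, 1, 2, 3],
    horizon=3,
)
```

In this toy run the initial allocation stays in force until the trigger fires at the final step. In the framework described by the paper, all three decisions would instead be produced by trained policies, with the reallocation level operating on reconstructed rather than raw state information.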

Paper Structure (18 sections, 10 equations, 5 figures, 1 table)

Introduction
Background
Methodology
Problem Formulation
Initial Task Assignment
Conditional Task Reallocation with State Reconstruction
Reward Shaping and Model Training
Case Study and Experiments
Task Scenario
Simulation Environment
Robot Model
Human Model
Modeling Information Uncertainty
Experiments and Results
Baselines and Ablation Models
...and 3 more sections

Figures (5)

Figure 1: Conceptual illustration of our adaptive task allocation method, named ATA-HRL, in MH-MR teams. Unlike previous one-sided approaches, we consider both inherent heterogeneity and in-process dynamic states of the team and its assigned tasks, hierarchically combining initial task allocation and conditional task reallocation. To handle state information uncertainty, we also introduce an auxiliary state learning task to contextually reconstruct incomplete or noisy state information.
Figure 2: Illustration of the proposed ATA-HRL framework. The main HRL hierarchy consists of two levels: the first, at time step $0$, determines the optimal ITA by considering inherent team heterogeneity; the second, at each subsequent time step from $1$ to $n$ during operation, decides whether to reallocate tasks and how to allocate them, considering additional dynamic operational changes. The optional reallocation decision is represented by a switch icon. An auxiliary state learning module is integrated into the second level to address state information uncertainty, enhancing decision-making during reallocation.
Figure 3: Detailed structure of the Conditional VAE (left) and GRU-based (right) state reconstruction framework.
Figure 4: Visual illustration of the simulation environment, zoomed for visibility. Each POI is distinguished by color to indicate the complexity level for hazard evaluation. Additionally, the type of pollution at each POI is determined by the building type, with warehouses representing ground POIs and factories representing air POIs.
Figure 5: Comparison of ATA-HRL with baselines and ablation models in the setting with 6 humans, 8 robots, and 60 POIs (left) and the setting with 12 humans, 16 robots, and 130 POIs (middle), and ablation study results (right).
