Hierarchical LLMs In-the-Loop Optimization for Real-Time Multi-Robot Target Tracking under Unknown Hazards

Yuwei Wu, Yuezhan Tao, Peihan Li, Guangyao Shi, Gaurav S. Sukhatme, Vijay Kumar, Lifeng Zhou

TL;DR

The paper tackles real-time, risk-aware multi-robot target tracking in unknown hazardous environments where sensor and communication attacks can degrade performance. It introduces a bi-level optimization framework in which an outer Task LLM reconfigures task assignments and an inner Action LLM provides rapid performance adaptations, all guided by a centralized optimizer and optionally a human supervisor; the outer/inner loop outputs are integrated with constraints through the objective $J(A,u)$ and feasibility conditions on $H(A,u)$ and $G(A,u)$. Key contributions include a novel hierarchical LLM-in-the-loop design, prompt-design and output-verification mechanisms (e.g., $\eta(A)$) to ensure constraint adherence, and thorough validation in both simulation with multiple LLMs and real hardware experiments. The results demonstrate improved tracking performance, robustness to hazards, and real-time feasibility, highlighting the potential of safety-aware LLM-assisted coordination for complex, real-world robotic teams.

Abstract

Real-time multi-robot coordination in hazardous and adversarial environments requires fast, reliable adaptation to dynamic threats. While Large Language Models (LLMs) offer strong high-level reasoning capabilities, the lack of safety guarantees limits their direct use in critical decision-making. In this paper, we propose a hierarchical optimization framework that integrates LLMs into the decision loop for multi-robot target tracking in dynamic and hazardous environments. Rather than generating control actions directly, LLMs are used to generate task configuration and adjust parameters in a bi-level task allocation and planning problem. We formulate multi-robot coordination for tracking tasks as a bi-level optimization problem, with LLMs to reason about potential hazards in the environment and the status of the robot team and modify both the inner and outer levels of the optimization. This hierarchical approach enables real-time adjustments to the robots' behavior. Additionally, a human supervisor can offer broad guidance and assessments to address unexpected dangers, model mismatches, and performance issues arising from local minima. We validate our proposed framework in both simulation and real-world experiments with comprehensive evaluations, demonstrating its effectiveness and showcasing its capability for safe LLM integration for multi-robot systems.
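
The idea that LLMs propose task configurations and parameters rather than control actions can be illustrated with a small verification gate placed between the LLM and the solver. The following is a minimal sketch, not the paper's implementation: `Proposal`, `risk_weight`, `violations`, `accept_or_fallback`, the assignment encoding, and the weight bounds are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """Hypothetical LLM output: a task assignment plus one tunable parameter."""
    assignment: dict    # robot id -> list of target ids (assumed encoding)
    risk_weight: float  # assumed scalar weight in the planner's objective


def violations(p, robots, targets, weight_bounds=(0.0, 10.0)):
    """Return readable constraint violations; an empty list means accept."""
    found = []
    unknown_robots = sorted(set(p.assignment) - set(robots))
    if unknown_robots:
        found.append(f"unknown robots: {unknown_robots}")
    assigned = {t for ts in p.assignment.values() for t in ts}
    unknown_targets = sorted(assigned - set(targets))
    if unknown_targets:
        found.append(f"unknown targets: {unknown_targets}")
    untracked = sorted(set(targets) - assigned)
    if untracked:
        found.append(f"untracked targets: {untracked}")
    low, high = weight_bounds
    if not low <= p.risk_weight <= high:
        found.append(f"risk_weight {p.risk_weight} outside [{low}, {high}]")
    return found


def accept_or_fallback(proposed, previous, robots, targets):
    """Forward a proposal to the solver only if it verifies; else keep the last one."""
    return proposed if not violations(proposed, robots, targets) else previous


# Example: two robots, three targets.
robots, targets = [0, 1], [10, 11, 12]
safe = Proposal({0: [10, 11], 1: [12]}, risk_weight=1.0)
bad = Proposal({0: [10], 2: [11]}, risk_weight=50.0)  # bad robot id, target 12 dropped, weight too large
chosen = accept_or_fallback(bad, safe, robots, targets)  # falls back to `safe`
```

In this sketch a rejected proposal leaves the previously accepted configuration in place, so the solver always runs on inputs that passed the check; the list of violations could also be fed back to the LLM as a correction prompt.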

Paper Structure (19 sections, 4 equations, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Real-world experiments. Two drones track ground robots in an environment with two sensing zones (red) and one communication zone (blue), guided by human supervision to mitigate risks.
  • Figure 2: The hierarchical LLM framework for multi-robot tracking processes both robot and human inputs through two LLMs: a high-frequency action LLM for reactive performance adaptation, and a low-frequency task LLM that provides strategic guidance and reconfiguration to the system. The optimization solver converts the LLMs' outputs into executable actions, guiding the multi-robot team in real time.
  • Figure 3: Demonstration of multi-robot target tracking with sensing and communication danger zones. Gray circles indicate undetected zones; red and blue circles mark sensing and communication zones after detection. The robot team consists of five robots tracking seven targets in an environment with four danger zones. In temporal order, (a–b) show a robot entering a sensing zone, being attacked, and recovering; (c–d) illustrate a communication failure and subsequent recovery.
  • Figure 4: Success rate and token count generated by different models for the Task LLM (a) and Action LLM (b), evaluated under varying task loads and environmental complexities.
  • Figure 5: Hardware experiment of multi-drone target tracking with various danger zones (red disks represent sensing danger zones, and blue disks represent communication danger zones).