Learning Efficient Flocking Control based on Gibbs Random Fields

Dengyu Zhang; Chenghao; Feng Xue; Qingrui Zhang

TL;DR

The paper tackles scalable, safe, and efficient distributed flocking for multi-robot systems in congested environments by formulating flocking as a multi-agent reinforcement learning (MARL) problem built on Gibbs Random Fields (GRFs). It introduces a decentralized training and execution (DTDE) scheme enabled by GRF-based credit assignment, and an action attention module that implicitly anticipates neighbors' motion intentions via mean-field-inspired attention. Learning is guided by a structured energy-based reward $r=\exp[-H(X)]$, where the energy $H(X)$ combines unary and pairwise terms. Local rewards, built from a decoupled pairwise energy $\hat{H}_p$, preserve the global optima, and the policy is optimized with PPO. The method reaches a success rate of $\approx 99\%$ in simulations and real-world experiments, and ablation studies confirm that credit assignment and action attention both contribute to performance and safety.
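To make the reward form $r=\exp[-H(X)]$ concrete, here is a minimal Python sketch that sums unary and pairwise energies and maps the total to a reward in $(0, 1]$. The specific potentials (squared distance to a goal, squared deviation from a reference spacing `d_ref`), the weights, and all function names are illustrative assumptions, not the potentials used in the paper.

```python
import math


def unary_energy(pos, goal):
    # Assumed unary potential: squared distance from one robot to a goal point.
    return (pos[0] - goal[0]) ** 2 + (pos[1] - goal[1]) ** 2


def pairwise_energy(p, q, d_ref=1.0):
    # Assumed pairwise potential: squared deviation from a desired spacing d_ref.
    return (math.dist(p, q) - d_ref) ** 2


def gibbs_reward(positions, goal, edges, w_unary=1.0, w_pair=1.0):
    """Return r = exp(-H), with H the weighted sum of unary and pairwise energies.

    positions: list of (x, y) robot positions
    edges: list of (i, j) index pairs that count as neighbors
    """
    energy = sum(w_unary * unary_energy(p, goal) for p in positions)
    energy += sum(
        w_pair * pairwise_energy(positions[i], positions[j]) for i, j in edges
    )
    return math.exp(-energy)


# Two robots: one at the goal, one at the reference spacing from it.
reward = gibbs_reward([(0.0, 0.0), (1.0, 0.0)], goal=(0.0, 0.0), edges=[(0, 1)])
```

Because the energy is non-negative under these assumed potentials, the reward is bounded in $(0, 1]$ and equals 1 only when every unary and pairwise term vanishes.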

Abstract

Flocking control is essential for multi-robot systems in diverse applications, yet achieving efficient flocking in congested environments poses challenges regarding computational burden, performance optimality, and motion safety. This paper addresses these challenges through a multi-agent reinforcement learning (MARL) framework built on Gibbs Random Fields (GRFs). With GRFs, a multi-robot system is represented by a set of random variables conforming to a joint probability distribution, thus offering a fresh perspective on flocking reward design. A decentralized training and execution mechanism, which improves the scalability of MARL with respect to the number of robots, is realized using a GRF-based credit assignment method. An action attention module is introduced to implicitly anticipate the motion intentions of neighboring robots, thereby mitigating potential non-stationarity issues in MARL. The proposed framework enables learning an efficient distributed control policy for multi-robot systems in challenging environments, with a success rate of around $99\%$, as demonstrated through thorough comparisons with state-of-the-art solutions in simulations and experiments. Ablation studies are also performed to validate the efficiency of the individual framework modules.
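The abstract does not describe the internals of the action attention module. As a rough illustration of how attention can condense neighbors' actions into a single summary, the sketch below applies generic scaled dot-product attention and returns a weighted average of neighbor actions. The query/key construction, the function names, and the use of plain dot-product scoring are assumptions made for illustration, not the paper's architecture.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_neighbor_actions(query, keys, actions):
    """Summarize neighbor actions with scaled dot-product attention.

    query: feature vector of the ego robot (assumed given)
    keys: one feature vector per neighbor, same length as query
    actions: one action vector per neighbor
    Returns the attention weights and the weighted mean action.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(actions[0])
    summary = [sum(w * a[d] for w, a in zip(weights, actions)) for d in range(dim)]
    return weights, summary


# Identical keys give uniform weights, so the summary is the plain mean action.
weights, summary = attend_neighbor_actions(
    query=[1.0, 0.0],
    keys=[[0.0, 0.0], [0.0, 0.0]],
    actions=[[1.0, 0.0], [0.0, 1.0]],
)
```

With uniform weights the summary reduces to the mean neighbor action, which is the mean-field limit. Non-uniform weights let the ego robot emphasize the neighbors most relevant to its own motion.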
