Constrained Learning for Decentralized Multi-Objective Coverage Control

Juan Cervino; Saurav Agarwal; Vijay Kumar; Alejandro Ribeiro

TL;DR

This work addresses decentralized, multi-objective coverage control for robot swarms operating over multiple importance density fields (IDFs). It reformulates the problem using duality so that the dual objective becomes a linear combination of IDFs, enabling a Learnable Perception-Action-Communication (LPAC) policy to serve as a primal solver for a single combined objective. The LPAC architecture integrates CNN-based perception, Graph Neural Network-based communication, and an MLP-based action head, and is trained via imitation learning against a clairvoyant CVT controller. Empirically, the method achieves about a 30% average improvement over state-of-the-art decentralized controllers, scales to larger environments and more robots, and transfers across varying numbers of IDFs, demonstrating strong practical impact for decentralized, constraint-aware coverage in multi-robot systems.
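The perception, communication, and action stages named above can be sketched as a single forward pass. This is a minimal NumPy illustration under loud assumptions, not the authors' model: the CNN is swapped for a flatten-plus-linear layer, the GNN is written as a generic K-hop graph filter over the communication graph, all weights are random, and every name (`lpac_forward`, `comm_graph`, the `params` keys) is hypothetical.

```python
import numpy as np

def comm_graph(positions, radius):
    """Adjacency matrix: robots i and j are linked if within `radius`."""
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=2)
    adj = (d <= radius).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj

def lpac_forward(local_maps, positions, radius, params):
    """Map per-robot local observations to per-robot 2D velocity commands."""
    n = local_maps.shape[0]
    # Perception (assumption: a linear layer stands in for the CNN).
    feats = np.maximum(0.0, local_maps.reshape(n, -1) @ params["W_perc"])
    # Communication: K-hop graph filter, sum_k A^k X W_k (assumed form).
    adj = comm_graph(positions, radius)
    hop = feats
    agg = hop @ params["W_gnn"][0]
    for w_k in params["W_gnn"][1:]:
        hop = adj @ hop            # one more round of neighbor exchange
        agg = agg + hop @ w_k
    hidden = np.maximum(0.0, agg)
    # Action: small MLP head producing a velocity per robot.
    h = np.maximum(0.0, hidden @ params["W_hidden"])
    return h @ params["W_out"]

rng = np.random.default_rng(0)
n, map_side, feat, hid = 5, 8, 16, 16
params = {
    "W_perc": 0.1 * rng.normal(size=(map_side * map_side, feat)),
    "W_gnn": [0.1 * rng.normal(size=(feat, hid)) for _ in range(3)],
    "W_hidden": 0.1 * rng.normal(size=(hid, hid)),
    "W_out": 0.1 * rng.normal(size=(hid, 2)),
}
local_maps = rng.random((n, map_side, map_side))   # hypothetical local IDF views
positions = rng.random((n, 2)) * 100.0
velocities = lpac_forward(local_maps, positions, radius=40.0, params=params)
```

Because information only flows through `adj`, a robot with no neighbors computes its action from its own observation alone, which is the sense in which such a policy is decentralized.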

Abstract

The multi-objective coverage control problem requires a robot swarm to collaboratively provide sensor coverage to multiple heterogeneous importance density fields (IDFs) simultaneously. We pose this as a constrained optimization problem and study two formulations: (1) Fair coverage, where we minimize the maximum coverage cost over all fields, promoting equitable resource distribution among them; and (2) Constrained coverage, where each field must be covered below a given cost threshold, ensuring that critical areas receive adequate coverage according to predefined importance levels. We study the decentralized setting, where robots have limited communication and local sensing capabilities, which makes the system more realistic, scalable, and robust. Given the complexity of the problem, we propose a novel decentralized constrained learning approach that combines primal-dual optimization with a Learnable Perception-Action-Communication (LPAC) neural network architecture. We show that the Lagrangian of the dual problem can be reformulated as a linear combination of the IDFs, enabling the LPAC policy to serve as a primal solver. We empirically demonstrate that the proposed method (i) significantly outperforms state-of-the-art decentralized controllers by 30% on average in terms of coverage cost, (ii) transfers well to larger environments with more robots, and (iii) scales with the number of IDFs and robots in the swarm.
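The primal-dual idea for the constrained coverage formulation can be sketched on a toy grid. This is an illustration under stated assumptions, not the paper's algorithm: the coverage cost is taken to be the standard locational cost (IDF-weighted squared distance to the nearest robot), a Lloyd-style centroid step stands in for the LPAC policy as the primal solver, the dual step is plain projected gradient ascent, and the thresholds, step size, and grid are hypothetical. The schedule of 500 steps with a dual update every 25 steps follows the Figure 5 caption.

```python
import numpy as np

def sq_dists(robots, grid):
    """(P, N) squared distances from each grid cell to each robot."""
    return ((grid[:, None, :] - robots[None, :, :]) ** 2).sum(axis=2)

def coverage_costs(robots, idfs, grid):
    """One cost per IDF: sum over cells of idf(q) * min_i ||q - x_i||^2."""
    return idfs @ sq_dists(robots, grid).min(axis=1)

def lloyd_step(robots, idf, grid, gain=0.5):
    """Stand-in primal solver (NOT the LPAC policy): move each robot part
    of the way to the idf-weighted centroid of its Voronoi cell."""
    owner = sq_dists(robots, grid).argmin(axis=1)
    moved = robots.copy()
    for i in range(len(robots)):
        w = idf[owner == i]
        if w.sum() > 0:
            centroid = (grid[owner == i] * w[:, None]).sum(axis=0) / w.sum()
            moved[i] += gain * (centroid - robots[i])
    return moved

def dual_update(lam, costs, thresholds, step):
    """Projected dual ascent: raise lambda_m when cost m exceeds its threshold."""
    return np.maximum(0.0, lam + step * (costs - thresholds))

rng = np.random.default_rng(0)
side, M, N = 32, 3, 6
xs, ys = np.meshgrid(np.arange(side) + 0.5, np.arange(side) + 0.5)
grid = np.stack([xs.ravel(), ys.ravel()], axis=1)
idfs = rng.random((M, grid.shape[0]))
idfs /= idfs.sum(axis=1, keepdims=True)      # each toy IDF sums to one
robots = rng.random((N, 2)) * side
thresholds = np.full(M, 20.0)                # hypothetical cost thresholds
lam = np.ones(M)
num_dual_updates = 0
for t in range(1, 501):
    # Primal step on the single re-weighted IDF sum_m lambda_m * phi_m.
    robots = lloyd_step(robots, lam @ idfs, grid)
    if t % 25 == 0:
        costs = coverage_costs(robots, idfs, grid)
        lam = dual_update(lam, costs, thresholds, step=0.05)
        num_dual_updates += 1
```

The primal step never sees the individual IDFs, only their weighted sum, which is why a single-objective coverage policy can be reused unchanged inside the loop.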

Paper Structure (10 sections, 2 theorems, 13 equations, 6 figures, 1 table, 1 algorithm)

Introduction
Related Work
Decentralized Multi-Objective Coverage
Multi-Objective Coverage
Approach: Primal-Dual and LPAC Loops
Primal-Dual Algorithm For Multi-Objective Coverage
Perception-Action-Communication Loops
Imitation Learning
Experiments
Conclusions

Key Result

Proposition 1

Given a set of scalars $\lambda_m\geq 0$, $m=1,\dots,M$, the linear combination of coverage control problems $\sum_{m=1}^M\lambda_m\mathcal{J}_{\phi_m}\left(\mathbf{X}\left(t\right)\right)$ is equivalent to a coverage control problem on the linear combination of IDFs, i.e., $\mathcal{J}_{\phi_\lambda}\left(\mathbf{X}\left(t\right)\right)$ with $\phi_\lambda=\sum_{m=1}^M\lambda_m\phi_m$.
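Proposition 1 can be checked numerically on a discretized environment. The sketch below assumes the standard locational coverage cost (IDF-weighted squared distance from each cell to its nearest robot), which is linear in the IDF; the grid size, the random IDFs, and the function name `coverage_cost` are illustrative choices, not taken from the paper.

```python
import numpy as np

def coverage_cost(robots, idf, grid):
    """Discretized locational cost: sum over cells of idf(q) * min_i ||q - x_i||^2."""
    d2 = ((grid[:, None, :] - robots[None, :, :]) ** 2).sum(axis=2)  # (P, N)
    return float((d2.min(axis=1) * idf).sum())

rng = np.random.default_rng(0)
side = 32
xs, ys = np.meshgrid(np.arange(side) + 0.5, np.arange(side) + 0.5)
grid = np.stack([xs.ravel(), ys.ravel()], axis=1)   # (P, 2) cell centers

M, N = 4, 8
idfs = rng.random((M, grid.shape[0]))   # M non-negative toy IDFs
lam = rng.random(M)                     # scalars lambda_m >= 0
robots = rng.random((N, 2)) * side      # robot positions X

# Left-hand side: weighted sum of the M individual coverage costs.
lhs = sum(lam[m] * coverage_cost(robots, idfs[m], grid) for m in range(M))
# Right-hand side: one coverage cost on the combined IDF sum_m lambda_m * phi_m.
rhs = coverage_cost(robots, lam @ idfs, grid)
```

The two values agree up to floating-point error because the nearest-robot distance at each cell does not depend on the IDF, so the weights can be moved inside the sum over cells.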

Figures (6)

Figure 1: Multi-objective coverage control on an environment with four importance density fields (IDFs), shown in different colors. Robots make localized observations based on their limited sensing range (middle). The right figure shows the cumulative observations of all robots, their positions, and the communication graph. The proposed primal-dual algorithm performs dual updates, which are used to re-weight the IDFs, and the LPAC policy computes the velocity actions for the robots.
Figure 2: Fair coverage control problem with $32$ robots and $4$ IDFs in a $1024\times 1024\,\text{m}^{2}$ environment. Relative to the clairvoyant controller, the coverage cost increases by $81\%$, $246\%$, $250\%$, and $258\%$ for LPAC, Centralized, Decentralized, and SFCC [malenciafair], respectively. The LPAC policy outperforms the centralized CVT and SFCC approaches.
Figure 3: Generalization study for the fair coverage problem for varying communication radius (192, 256, 320) and sensor size (64, 96, 128), averaged over $100$ environments. The LPAC consistently outperforms the centralized CVT for all configurations.
Figure 4: Case study: Correlation of maximum coverage cost (top row) and dual variables (bottom row) per IDF for the environment shown in \ref{fig:main_env}. Each color represents a different IDF. The primal-dual algorithm performs dual updates such that the least covered IDF has more weight. The LPAC policy drives the robots efficiently to reduce the coverage cost.
Figure 5: Scalability study for the Fair Coverage problem: The plots show the coverage cost ratio of LPAC to Centralized CVT (C-CVT) on the $y$-axis. The simulations are executed for $500$ time steps of $0.5\,\text{s}$ each, with dual updates every $25$ steps. The same model is used for all experiments without further training. The LPAC consistently outperforms C-CVT across all experiments. Performance degrades when the number of robots $n$ is scaled down, whereas the LPAC performs significantly better as $n$ scales up. The performance for varying numbers of IDFs and environment sizes remains relatively stable.
...and 1 more figure

Theorems & Definitions (4)

Proposition 1
proof
Corollary 2
proof
