Constrained Learning for Decentralized Multi-Objective Coverage Control
Juan Cervino, Saurav Agarwal, Vijay Kumar, Alejandro Ribeiro
TL;DR
This work addresses decentralized, multi-objective coverage control for robot swarms operating over multiple importance density fields (IDFs). It reformulates the problem using duality so that the dual objective becomes a linear combination of IDFs, enabling a Learnable Perception-Action-Communication (LPAC) policy to serve as a primal solver for a single combined objective. The LPAC architecture integrates CNN-based perception, Graph Neural Network-based communication, and an MLP-based action head, and is trained via imitation learning against a clairvoyant CVT controller. Empirically, the method improves coverage cost by about 30% on average over state-of-the-art decentralized controllers, scales to larger environments and more robots, and transfers across varying numbers of IDFs.
Abstract
The multi-objective coverage control problem requires a robot swarm to collaboratively provide sensor coverage to multiple heterogeneous importance density fields (IDFs) simultaneously. We pose this as an optimization problem with constraints and study two different formulations: (1) Fair coverage, where we minimize the maximum coverage cost for any field, promoting equitable resource distribution among all fields; and (2) Constrained coverage, where each field must be covered below a certain cost threshold, ensuring that critical areas receive adequate coverage according to predefined importance levels. We study the decentralized setting where robots have limited communication and local sensing capabilities, making the system more realistic, scalable, and robust. Given the complexity, we propose a novel decentralized constrained learning approach that combines primal-dual optimization with a Learnable Perception-Action-Communication (LPAC) neural network architecture. We show that the Lagrangian of the dual problem can be reformulated as a linear combination of the IDFs, enabling the LPAC policy to serve as a primal solver. We empirically demonstrate that the proposed method (i) significantly outperforms state-of-the-art decentralized controllers by 30% on average in terms of coverage cost, (ii) transfers well to larger environments with more robots, and (iii) scales with the number of IDFs and robots in the swarm.
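The linear-combination property can be illustrated with a small numerical sketch. The abstract does not give the cost function or the exact Lagrangian, so the following are assumptions made for illustration only: a discretized coverage cost (each grid cell weighted by its IDF value times the squared distance to the nearest robot), a Lagrangian of the form sum_k lambda_k * (J_k(p) - alpha_k), and a projected dual ascent update. All function names (`coverage_cost`, `combined_idf`, `lagrangian`, `dual_ascent_step`) are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def coverage_cost(positions, idf, grid):
    """Assumed discretized coverage cost for one IDF.

    positions: (N, 2) robot positions
    idf:       (H, W) importance values on the grid
    grid:      (H, W, 2) coordinates of the grid cells
    """
    diff = grid[:, :, None, :] - positions[None, None, :, :]  # (H, W, N, 2)
    sq_dist = np.sum(diff ** 2, axis=-1)                      # (H, W, N)
    nearest = sq_dist.min(axis=-1)                            # (H, W)
    return float(np.sum(idf * nearest))


def combined_idf(idfs, lambdas):
    """Weight K IDFs of shape (K, H, W) by the dual variables (K,) and sum them."""
    return np.tensordot(lambdas, idfs, axes=1)                # (H, W)


def lagrangian(positions, idfs, lambdas, thresholds, grid):
    """Assumed Lagrangian: sum_k lambda_k * (J_k(positions) - alpha_k)."""
    costs = np.array([coverage_cost(positions, f, grid) for f in idfs])
    return float(np.dot(lambdas, costs - thresholds))


def dual_ascent_step(lambdas, costs, thresholds, step):
    """Projected gradient ascent on the dual variables (kept non-negative)."""
    return np.maximum(0.0, lambdas + step * (costs - thresholds))
```

Under these assumptions, `lagrangian(p, idfs, lam, alpha, grid)` equals `coverage_cost(p, combined_idf(idfs, lam), grid) - lam @ alpha`, because the cost is linear in the IDF. For fixed dual variables, minimizing the Lagrangian over robot positions is therefore a single-IDF coverage problem on the weighted sum of fields, which is the role the abstract assigns to the LPAC policy as primal solver.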
