Pragmatic Communication in Multi-Agent Collaborative Perception

Yue Hu, Xianghe Pang, Xiaoqi Qin, Yonina C. Eldar, Siheng Chen, Ping Zhang, Wenjun Zhang

TL;DR

PragComm tackles the perception-communication bottleneck in multi-agent collaborative perception by introducing a pragmatic strategy that transmits only task-critical information. It decomposes communication into spatial-temporal selection, channel-efficient codebook representation, and selective collaboration links, forming an ADMM-like optimization framework that alternates between message determination and network parameter updates. The PragComm system implements a single-agent detector/tracker and a pragmatic collaboration module, consisting of spatial/temporal/channel compressors, a shared codebook, and a sparse communication graph, enabling robust 3D object detection and tracking under varied bandwidths. Empirical results on OPV2V, V2V4Real, and V2X-SIM2.0 demonstrate orders-of-magnitude reductions in communication volume with superior or competitive perception performance, validating the effectiveness of task-oriented, temporally coherent collaboration. The work advances practical, scalable multi-agent perception by aligning information exchange with downstream task utility and resource constraints.

Abstract

Collaborative perception allows each agent to enhance its perceptual abilities by exchanging messages with others. This inherently creates a trade-off between perception ability and communication cost. Previous works transmit complete, full-frame, high-dimensional feature maps among agents, resulting in substantial communication costs. To promote communication efficiency, we propose transmitting only the information needed for the collaborator's downstream task. This pragmatic communication strategy focuses on three key aspects: i) pragmatic message selection, which selects task-critical parts from the complete data, resulting in spatially and temporally sparse feature vectors; ii) pragmatic message representation, which approximates high-dimensional feature vectors with a task-adaptive dictionary, so that agents can communicate with integer indices; iii) pragmatic collaborator selection, which identifies beneficial collaborators and prunes unnecessary communication links. Following this strategy, we first formulate a mathematical optimization framework for the perception-communication trade-off and then propose PragComm, a multi-agent collaborative perception system with two key components: i) single-agent detection and tracking and ii) pragmatic collaboration. The proposed PragComm promotes pragmatic communication and adapts to a wide range of communication conditions. We evaluate PragComm on both collaborative 3D object detection and tracking tasks, using a real-world dataset, V2V4Real, and two simulation datasets, OPV2V and V2X-SIM2.0. PragComm consistently outperforms previous methods with more than 32.7K times lower communication volume on OPV2V. Code is available at github.com/PhyllisH/PragComm.
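
The idea of pragmatic message representation (aspect ii above) can be illustrated with a minimal sketch: each selected feature vector is replaced by the integer index of its nearest entry in a shared dictionary, and the receiver looks the entry up again. This is an illustration under stated assumptions, not the PragComm implementation: the function names, the nearest-neighbour rule under squared L2 distance, the toy codebook, and the 32-bit-per-value baseline are all hypothetical choices made for the example.

```python
import math


def nearest_code_index(vector, codebook):
    """Return the index of the codeword closest to `vector` (squared L2 distance).

    Assumption: nearest-neighbour lookup stands in for the task-adaptive
    dictionary described in the abstract.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def encode(features, codebook):
    """Sender side: map each feature vector to an integer code index."""
    return [nearest_code_index(f, codebook) for f in features]


def decode(indices, codebook):
    """Receiver side: recover approximate features from the shared codebook."""
    return [codebook[i] for i in indices]


def bits_dense(num_vectors, dim, bits_per_value=32):
    """Bits needed to send raw feature vectors (hypothetical 32-bit values)."""
    return num_vectors * dim * bits_per_value


def bits_indexed(num_vectors, codebook_size):
    """Bits needed to send one code index per feature vector."""
    return num_vectors * max(1, math.ceil(math.log2(codebook_size)))


# Toy example: four 2-D codewords shared by sender and receiver.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
features = [[0.9, 0.1], [0.1, 1.1]]

indices = encode(features, codebook)
approx = decode(indices, codebook)
print(indices, approx)
print(bits_dense(len(features), 2), bits_indexed(len(features), len(codebook)))
```

In this toy setting the message shrinks from two 2-D float vectors to two small integers; the savings reported in the paper additionally depend on spatial and temporal selection and on collaborator selection, which this sketch does not model.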

Paper Structure

This paper contains 28 sections, 21 equations, and 17 figures.

Figures (17)

  • Figure 1: Communication strategies in collaborative perception. (a) Shannon communication employs lossless compression techniques to compress data into messages; it is lossless for general tasks but incurs substantial communication costs. (b) Previous communication methods leverage lossy compression and follow an all-or-nothing strategy, compressing task-critical and task-irrelevant data without distinction and thereby compromising task utility. (c) Our pragmatic communication retains only task-critical data, encoded as code indices in the pragmatic messages. That is, it transmits the demanded foreground object dynamics to each collaborator, reducing communication costs while retaining task utility.
  • Figure 2: PragComm realizes a collaborative object detection and tracking system. Collaboration enhances individual perceptual features with communication-enabled pragmatic messages, and message compression ensures efficient communication.
  • Figure 3: Overview of the spatial and temporal compressor. The spatial compressor picks out the perceptually critical foreground regions. The temporal compressor has two options: when the updating frequency is reached, it selects all regions (the upper branch); otherwise, it picks out only the dynamic regions (the bottom branch).
  • Figure 4: Overview of the channel compressor. The channel compressor transforms the dense feature representation into the lightweight code index representation.
  • Figure 5: Overview of the prediction module. It aligns the historical feature map with the current timestamp by applying a warp function with the displacements predicted by the flow estimation function.
  • ...and 12 more figures
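
The two-branch selection logic described in the Figure 3 caption can be sketched as follows. This is a hedged illustration rather than the paper's module: the function name `select_cells`, the scalar per-cell features, the thresholds, and the fixed `update_every` interval are hypothetical, and the sketch assumes that the temporal stage only considers cells that already passed the spatial (foreground) stage.

```python
def select_cells(confidence, current, previous, step,
                 conf_threshold=0.5, change_threshold=0.1, update_every=5):
    """Return the (row, col) cells to transmit at time `step`.

    Spatial stage: keep cells whose foreground confidence reaches
    `conf_threshold`.
    Temporal stage: on a refresh step (every `update_every` steps, or when
    no previous frame exists) keep every foreground cell; otherwise keep only
    foreground cells whose feature changed by more than `change_threshold`.
    All thresholds and the scalar features are assumptions for illustration.
    """
    refresh = previous is None or step % update_every == 0
    selected = []
    for r, row in enumerate(confidence):
        for c, score in enumerate(row):
            if score < conf_threshold:
                continue  # background cell: never transmitted
            changed = refresh or abs(current[r][c] - previous[r][c]) > change_threshold
            if changed:
                selected.append((r, c))
    return selected


# Toy 2x2 grid: three foreground cells, one background cell at (0, 1).
confidence = [[0.9, 0.1], [0.8, 0.7]]
previous = [[1.0, 0.0], [2.0, 3.0]]
current = [[1.0, 0.5], [2.5, 3.0]]

refresh_cells = select_cells(confidence, current, previous, step=5)
dynamic_cells = select_cells(confidence, current, previous, step=6)
print(refresh_cells, dynamic_cells)
```

On the refresh step all three foreground cells are sent; on the following step only the foreground cell whose feature changed is sent, and the background cell is skipped even though its value changed.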