WhisperNet: A Scalable Solution for Bandwidth-Efficient Collaboration

Gong Chen, Chaokun Zhang, Xinyan Zhao

TL;DR

WhisperNet is a bandwidth-aware framework that introduces a receiver-centric paradigm for global coordination across agents. Its results demonstrate that globally coordinated allocation across what and where to share is the key to efficient collaborative perception.

Abstract

Collaborative perception is vital for autonomous driving yet remains constrained by tight communication budgets. Earlier work reduced bandwidth by compressing full feature maps with fixed-rate encoders, which adapt poorly to changing environments. Later spatial selection methods improved efficiency by focusing on salient regions, but this object-centric approach often sacrifices global context, weakening holistic scene understanding. To overcome these limitations, we introduce WhisperNet, a bandwidth-aware framework that adopts a novel, receiver-centric paradigm for global coordination across agents. Senders generate lightweight saliency metadata, while the receiver formulates a global request plan that dynamically budgets feature contributions across agents and features, retrieving only the most informative features. A collaborative feature routing module then aligns related messages before fusion to ensure structural consistency. Extensive experiments show that WhisperNet achieves state-of-the-art performance, improving AP@0.7 on OPV2V by 2.4% with only 0.5% of the communication cost. As a plug-and-play component, it boosts strong baselines with merely 5% of full bandwidth while maintaining robustness under localization noise. These results demonstrate that globally coordinated allocation across what and where to share is the key to achieving efficient collaborative perception.

Paper Structure (13 sections, 11 equations, 10 figures, 5 tables)


Figures (10)

  • Figure 1: Visualization of different channel groups. The proportional table in the figure represents the model's primary (Orange), secondary (Yellow), and marginal (Green) channels.
  • Figure 2: Impact of Channel Pruning on Performance.
  • Figure 3: The architecture of WhisperNet. Each sender agent first extracts features from its raw sensor data and generates compact importance maps that summarize both spatial and channel-wise saliency. These maps are then exchanged with a receiver agent, which dynamically requests only the most critical feature subsets. Finally, the requested features are fused by a collaborative routing module.
  • Figure 4: Sender-Side Importance Estimation module. It evaluates features from both spatial and channel perspectives to generate importance maps that guide communication coordination.
  • Figure 5: Collaborative Feature Routing module. It enables cross-vehicle feature exchange by grouping and processing semantically similar channels.
  • ...and 5 more figures
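
The receiver-centric exchange described in the abstract and in Figure 3 can be illustrated with a small sketch. This is not the paper's request-planning algorithm, which is not specified here. It is a greedy global top-k selection used as a stand-in, under the assumption that each sender reports one importance score per feature unit. The names `plan_requests`, `Unit`, `car_a`, and `car_b`, and the (cell, channel group) unit layout, are hypothetical. The point is only that the budget is spent across all senders jointly rather than split into fixed per-sender quotas.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical unit of transmission: one (spatial cell, channel group) slice
# of a sender's feature map. The paper's actual granularity may differ.
Unit = Tuple[int, int]


def plan_requests(
    importance: Dict[str, Dict[Unit, float]], budget: int
) -> Dict[str, List[Unit]]:
    """Pick the `budget` highest-scoring units across ALL senders at once.

    `importance` maps a sender id to the scores it advertised in its
    lightweight metadata. The result maps each sender id to the units the
    receiver would request from it, in descending score order.
    """
    # Flatten every sender's advertised scores into one candidate pool.
    candidates = [
        (score, agent, unit)
        for agent, scores in importance.items()
        for unit, score in scores.items()
    ]
    # Global selection: a sender with nothing salient may receive no requests.
    chosen = heapq.nlargest(budget, candidates)

    plan: Dict[str, List[Unit]] = {agent: [] for agent in importance}
    for _score, agent, unit in chosen:
        plan[agent].append(unit)
    return plan


# Toy metadata from two senders (scores are made up for illustration).
importance = {
    "car_a": {(0, 0): 0.9, (0, 1): 0.2, (1, 0): 0.7},
    "car_b": {(0, 0): 0.8, (1, 1): 0.1},
}
plan = plan_requests(importance, budget=3)
print(plan)
```

With a budget of three units, the toy plan requests two units from `car_a` and one from `car_b`, because the allocation follows the advertised scores rather than an even split. A real system would also need to weigh redundancy between senders and the receiver's own coverage, which this sketch ignores.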