DiffCP: Ultra-Low Bit Collaborative Perception via Diffusion Model

Ruiqing Mao; Haotian Wu; Yukuan Jia; Zhaojun Nan; Yuxuan Sun; Sheng Zhou; Deniz Gündüz; Zhisheng Niu

TL;DR

DiffCP is a diffusion-model-based framework for ultra-low-bandwidth collaborative perception (CP). The ego agent reconstructs the co-agent's bird's-eye-view (BEV) features using geometric and semantic conditioning: the co-agent transmits only a compact semantic vector, and a latent BEV diffusion model at the ego agent recovers the co-agent's BEV feature distribution. This enables feature-level collaboration at object-level data rates. On OPV2V, DiffCP reduces the data rate by about 14.5x while maintaining state-of-the-art perception performance, and a flexible Top-K augmentation supports high-precision tasks. The approach works as a plug-in to existing BEV-based CP pipelines, easing deployment of connected intelligent systems over congested wireless channels.
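The Top-K augmentation is only named here, not specified. As a rough illustration of the general idea, the sketch below assumes it means sending the K largest-magnitude entries of the co-agent's feature map alongside the semantic vector and overwriting the matching positions in the ego agent's reconstruction. The selection rule and the function names `top_k_entries` and `apply_top_k` are assumptions for illustration, not the paper's method.

```python
import numpy as np

def top_k_entries(feature_map, k):
    """Return flat indices and values of the k largest-magnitude entries.

    Hypothetical selection rule: magnitude-based ranking is an assumption.
    """
    flat = feature_map.ravel()
    k = min(k, flat.size)
    if k <= 0:
        return np.empty(0, dtype=np.int64), np.empty(0, dtype=flat.dtype)
    # argpartition places the k largest magnitudes in the last k slots
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]

def apply_top_k(reconstruction, idx, values):
    """Overwrite the reconstruction at the transmitted positions."""
    out = reconstruction.copy()
    out.flat[idx] = values
    return out

# Toy example: the co-agent sends its 2 strongest entries.
co_agent_feature = np.array([[0.1, -5.0], [3.0, 0.2]])
ego_reconstruction = np.zeros_like(co_agent_feature)
idx, values = top_k_entries(co_agent_feature, k=2)
patched = apply_top_k(ego_reconstruction, idx, values)
```

Under this assumption, a larger K sends more exact values and costs more bits, which is one way the trade-off between data rate and precision could be tuned.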

Abstract

Collaborative perception (CP) is emerging as a promising solution to the inherent limitations of stand-alone intelligence. However, current wireless communication systems are unable to support feature-level and raw-level collaborative algorithms due to their enormous bandwidth demands. In this paper, we propose DiffCP, a novel CP paradigm that utilizes a specialized diffusion model to efficiently compress the sensing information of collaborators. By incorporating both geometric and semantic conditions into the generative model, DiffCP enables feature-level collaboration with an ultra-low communication cost, advancing the practical implementation of CP systems. This paradigm can be seamlessly integrated into existing CP algorithms to enhance a wide range of downstream tasks. Through extensive experiments, we investigate the trade-offs between communication, computation, and performance. Numerical results demonstrate that DiffCP can reduce communication costs 14.5-fold while maintaining the same performance as the state-of-the-art algorithm.
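To make the bandwidth argument concrete, the following sketch compares the payload of a dense feature map with that of a short vector. All sizes here (a 64x32x32 map, a 512-entry vector, 32 bits per value) are hypothetical and chosen only for the arithmetic. The resulting ratio is not the paper's 14.5-fold figure, which is reported relative to the state-of-the-art algorithm.

```python
def payload_bits(num_values, bits_per_value):
    """Bits needed to send num_values numbers without further compression."""
    return num_values * bits_per_value

def reduction_factor(baseline_bits, proposed_bits):
    """How many times smaller the proposed payload is than the baseline."""
    return baseline_bits / proposed_bits

# Hypothetical sizes, for illustration only.
feature_map_bits = payload_bits(64 * 32 * 32, 32)  # dense BEV feature map
vector_bits = payload_bits(512, 32)                # compact semantic vector

ratio = reduction_factor(feature_map_bits, vector_bits)
print(f"feature map: {feature_map_bits} bits, vector: {vector_bits} bits")
print(f"reduction factor: {ratio:.1f}x")
```

The point of the arithmetic is that the transmitted payload scales with the vector length rather than with the spatial size of the feature map.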

Paper Structure (16 sections, 7 equations, 4 figures, 2 tables)

Introduction
Related Work
Collaborative Perception (CP)
Diffusion Model
Methodology
Data Processing
Latent BEV Diffusion Formulation
Learning to Control the Viewpoint
Learning to Incorporate Collaborative Information
Downstream-Aware Augmentation
Experimental Results
Experiment Settings
Trade-off Between Computation Time and Reconstruction Accuracy
Trade-off Between Data Rate and Reconstruction Accuracy
Case Study: 3D Object Detection
...and 1 more sections

Figures (4)

Figure 1: An illustration of CP in IUSs. Compared with previous feature-level CP, the proposed DiffCP leverages the ego agent's geometric prior to recover feature information from the co-agent. This enables feature-level CP within object-level data rate requirements, satisfying real-world wireless communication constraints.
Figure 2: The overall architecture of DiffCP. The size of the feature maps is provided as an example to demonstrate the compression functionality. Left: During the training process, the model is trained using noised BEV features from the co-agent to learn the denoising process. Right: During the inference process, pure noise features are input to reconstruct the co-agent's BEV features, while only the semantic feature vectors are transmitted through the wireless channel.
Figure 3: Visualization of the reconstructed BEV features at different sampling steps ($L=512$).
Figure 4: Performance under various data rates.
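
The training and inference paths described in the Figure 2 caption follow the usual diffusion pattern: noise is added to a clean feature during training, and sampling starts from pure noise at inference. The sketch below shows that pattern with a standard DDPM-style linear schedule and ancestral sampler in NumPy. It is a minimal sketch, not the paper's architecture: the schedule, the `denoiser(x, t, cond)` interface, the `cond` keys, and the zero-output stand-in denoiser are all assumptions.

```python
import numpy as np

def make_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear noise schedule (assumed; the paper's schedule is not given here)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Training path: noise a clean co-agent BEV feature to step t."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps  # a denoiser would be trained to predict eps from xt

def sample(denoiser, shape, cond, betas, alphas, alpha_bars, rng):
    """Inference path: start from pure noise and denoise step by step."""
    x = rng.standard_normal(shape)
    for t in reversed(range(len(betas))):
        eps_hat = denoiser(x, t, cond)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x

def zero_denoiser(x, t, cond):
    """Stand-in for a trained network; always predicts zero noise."""
    return np.zeros_like(x)

rng = np.random.default_rng(0)
betas, alphas, alpha_bars = make_schedule(50)

# Hypothetical conditioning: the received semantic vector plus the ego
# agent's own geometric information (keys and shapes are illustrative).
cond = {"semantic": np.zeros(512), "geometric": np.zeros((4, 8, 8))}

clean_feature = np.ones((4, 8, 8))
noised_feature, target_eps = add_noise(clean_feature, 10, alpha_bars, rng)
reconstruction = sample(zero_denoiser, (4, 8, 8), cond, betas, alphas, alpha_bars, rng)
```

In this sketch only `cond["semantic"]` would need to cross the wireless link; the noise, the schedule, and the geometric condition are all available locally at the ego agent.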