CoopDiff: A Diffusion-Guided Approach for Cooperation under Corruptions

Gong Chen; Chaokun Zhang; Pengcheng Lv

TL;DR

Benefiting from the inherent denoising properties of diffusion, CoopDiff consistently outperforms prior methods across all degradation types, lowers the relative corruption error, and offers a tunable balance between precision and inference efficiency.

Abstract

Cooperative perception lets agents share information to expand coverage and improve scene understanding. In real-world scenarios, however, diverse and unpredictable corruptions undermine its robustness and generalization. To address these challenges, we introduce CoopDiff, a diffusion-based cooperative perception framework that mitigates corruptions via a denoising mechanism. CoopDiff adopts a teacher-student paradigm: the Quality-Aware Teacher performs voxel-level early fusion with Quality of Interest weighting and semantic guidance, then produces clean supervision features via a diffusion denoiser. The Dual-Branch Diffusion Student first encodes the ego and cooperative streams separately to reconstruct the teacher's clean targets. An Ego-Guided Cross-Attention mechanism then facilitates balanced decoding under degradation by adaptively integrating ego and cooperative features. We evaluate CoopDiff on two constructed multi-degradation benchmarks, OPV2Vn and DAIR-V2Xn, each incorporating six corruption types, including environmental and sensor-level distortions. Benefiting from the inherent denoising properties of diffusion, CoopDiff consistently outperforms prior methods across all degradation types and lowers the relative corruption error. Furthermore, it offers a tunable balance between precision and inference efficiency.
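To make the idea of ego-guided cross-attention concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a plain single-head scaled dot-product attention in which ego features supply the queries and cooperative features supply the keys and values, with a residual connection back to the ego features. The function name, the projection matrices `w_q`, `w_k`, `w_v`, and the flattened `(tokens, channels)` layout are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def ego_guided_cross_attention(ego, coop, w_q, w_k, w_v):
    """Hypothetical sketch: ego tokens attend to cooperative tokens.

    ego:  (N, C) ego feature tokens (assumed layout)
    coop: (M, C) cooperative feature tokens (assumed layout)
    Returns the fused (N, C) features and the (N, M) attention weights.
    """
    q = ego @ w_q                              # queries come from the ego stream
    k = coop @ w_k                             # keys come from the cooperative stream
    v = coop @ w_v                             # values come from the cooperative stream
    scores = q @ k.T / np.sqrt(q.shape[-1])    # scaled dot-product, shape (N, M)
    attn = softmax(scores, axis=-1)            # each ego token weights the coop tokens
    fused = ego + attn @ v                     # residual keeps the ego features as anchor
    return fused, attn


rng = np.random.default_rng(0)
N, M, C = 4, 6, 8
ego = rng.normal(size=(N, C))
coop = rng.normal(size=(M, C))
w_q, w_k, w_v = (rng.normal(size=(C, C)) / np.sqrt(C) for _ in range(3))
fused, attn = ego_guided_cross_attention(ego, coop, w_q, w_k, w_v)
```

In this sketch the residual term means that if the cooperative values carry no signal, the output falls back to the ego features, which is one simple way to read "ego-guided" integration under degradation.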

Paper Structure (12 sections, 14 equations, 7 figures, 8 tables)

Introduction
Related Work
Method
Quality-Aware Early Fusion Teacher
Dual-Branch Denoising Student
Loss
Experiments
Datasets and Experimental Settings
Quantitative Evaluation
Component Analysis
Qualitative Evaluation
Conclusion

Figures (7)

Figure 1: Overall performance comparison of average results for eight state-of-the-art methods across six corruption types.
Figure 2: Overview of the proposed CoopDiff, which employs a Teacher-Student paradigm. The Quality-Aware Teacher model $\mathcal{D}_{\Psi}^{\text{tea}}$ uses an early-fusion strategy to process multi-agent inputs and generate a clean target feature map. The Dual-Branch Denoising Student $\mathcal{D}_{\theta}^{\text{stu}}$ is trained to reconstruct the target by leveraging local and cooperative conditions.
Figure 3: Overview of the Quality-Aware Early Fusion Teacher. Multi-agent features are first fused via Quality of Interest (QoI) weighting and semantic guidance, after which the GCM-based diffusion network denoises the input. The right shows the architecture of GCM.
Figure 4: Architecture of the Cooperative Deformable Attention (CDA) module.
Figure 5: Robustness of the proposed method and the compared baselines to varying levels of corruption.
...and 2 more figures
