
Breaking Data Silos: Cross-Domain Learning for Multi-Agent Perception from Independent Private Sources

Jinlong Li, Baolu Li, Xinyu Liu, Runsheng Xu, Jiaqi Ma, Hongkai Yu

TL;DR

This work addresses data silos in multi-agent perception, which arise when collaborating entities each train their agents on independent, private data. It introduces the Feature Distribution-aware Aggregation (FDA) framework, which comprises two components: the Learnable Feature Compensation Module (LFCM), which reduces inter-agent feature disparities, and the Distribution-aware Statistical Consistency Module (DSCM), which aligns feature distributions via Maximum Mean Discrepancy (MMD). Integrated as a plug-in within V2V cooperative perception, FDA operates on ego and neighbor agent features. It compensates neighbor features with a residual update, $\hat{\mathbf{F}}^{cav} = \mathrm{LFCM}(\mathbf{F}^{cav}) + \mathbf{F}^{cav}$, and minimizes the MMD loss $L_{mmd}$ alongside the conventional detection loss $L_{det}$ through $L_{total} = \lambda L_{det} + \omega L_{mmd}$. Experiments on OPV2V and V2XSet show that FDA substantially mitigates distribution gaps and restores much of the lost cooperative perception performance, highlighting the practical value of cross-domain learning in real-world, privacy-constrained multi-agent systems.
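The residual update and combined objective described above can be illustrated with a minimal sketch. It is not the authors' implementation. The Gaussian kernel, its bandwidth `sigma`, the biased MMD estimator, the treatment of features as flat vectors, and the function names are all assumptions made for illustration. The `lfcm` argument is a hypothetical stand-in for the learned compensation module.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors.
    The kernel choice and bandwidth are assumptions, not taken from the paper."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mean_kernel(xs, ys, sigma=1.0):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(ego_feats, cav_feats, sigma=1.0):
    """Biased estimate of the squared MMD between two sets of feature vectors."""
    return (mean_kernel(ego_feats, ego_feats, sigma)
            + mean_kernel(cav_feats, cav_feats, sigma)
            - 2.0 * mean_kernel(ego_feats, cav_feats, sigma))

def compensate(cav_feats, lfcm):
    """Residual update F_hat = LFCM(F) + F, applied to each feature vector.
    `lfcm` is any callable returning a residual of the same length."""
    return [[r + f for r, f in zip(lfcm(feat), feat)] for feat in cav_feats]

def total_loss(l_det, l_mmd, lam=1.0, omega=1.0):
    """L_total = lambda * L_det + omega * L_mmd (default weights are placeholders)."""
    return lam * l_det + omega * l_mmd

# Toy data: the neighbor (CAV) features are the ego features shifted by +2.
ego = [[0.0, 0.0], [1.0, 1.0]]
cav = [[2.0, 2.0], [3.0, 3.0]]

# Hypothetical "learned" residual that happens to undo the shift exactly.
def toy_lfcm(feat):
    return [-2.0] * len(feat)

gap_before = mmd_squared(ego, cav)
gap_after = mmd_squared(ego, compensate(cav, toy_lfcm))
print(gap_before > gap_after)  # the residual closes the toy distribution gap
```

In this toy case the residual removes the shift entirely, so the MMD estimate drops to zero. A trained module would only approximate this, and the weights `lam` and `omega` would need to be tuned for the detection backbone in use.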

Abstract

The agents in a multi-agent perception system may come from different companies. Each company might use the same classic neural-network encoder architecture for feature extraction. However, the data used to train each agent is independent and private to its company, which creates a Distribution Gap between the private datasets used to train different agents in a multi-agent perception system. The data silos created by this Distribution Gap can cause a significant performance decline in multi-agent perception. In this paper, we thoroughly examine the impact of the Distribution Gap on existing multi-agent perception systems. To break the data silos, we introduce the Feature Distribution-aware Aggregation (FDA) framework for cross-domain learning, which mitigates the Distribution Gap in multi-agent perception. FDA comprises two key components, the Learnable Feature Compensation Module and the Distribution-aware Statistical Consistency Module, both of which enhance intermediate features to minimize the distribution gap among multi-agent features. Extensive experiments on the public OPV2V and V2XSet datasets demonstrate FDA's effectiveness in point cloud-based 3D object detection and show that it is a valuable addition to existing multi-agent perception systems.

Paper Structure (11 sections, 6 equations, 4 figures, 3 tables)


Figures (4)

  • Figure 1: Illustration of the Distribution Gap of different independent private data for training distinct agents in multi-agent perception. Here we use V2X cooperative perception in autonomous driving as an example.
  • Figure 2: Architecture of the proposed Feature Distribution-aware Aggregation (FDA) framework. FDA uses LFCM to generate a residual compensation map that enhances other-agent features, then uses DSCM to mitigate the Distribution Gap.
  • Figure 3: 3D object detection visualization. The orange point cloud belongs to the ego vehicle, and the white point clouds belong to the CAVs. Green and red 3D bounding boxes represent the ground truth and the predictions, respectively. The detection results of the proposed FDA are clearly more accurate. False detections are highlighted with purple arrows.
  • Figure 4: Visualization of intermediate features before and after applying FDA. Two point cloud samples from the OPV2V testing set are selected to evaluate CoBEVT [xu2023cobevt]; the orange point cloud belongs to the ego vehicle. After FDA is applied, the intermediate features from the CAVs exhibit patterns more similar to those of the ego. Bright pixels tend to correspond to objects.