
On the Federated Learning Framework for Cooperative Perception

Zhenrong Zhang, Jianan Liu, Xi Zhou, Tao Huang, Qing-Long Han, Jingxin Liu, Hongbin Liu

TL;DR

The paper tackles privacy-preserving cooperative perception (CP) in vehicle-to-everything (V2X) transportation, where data heterogeneity across CP participants degrades federated learning performance. It introduces FedDWA, a dynamic weighted aggregation algorithm, together with DALoss, a KL-divergence-guided regularization term that aligns local updates with the global model in bird's-eye-view (BEV) perception. Using a BEV transformer backbone and the OpenV2V dataset augmented with FedBEVT data, the approach achieves higher average IoU across four CP clients (a bus, a truck, and two cars) and converges in fewer communication rounds; an ablation shows that DALoss adds to the gain obtained from FedDWA alone. The results indicate that dynamic weighting and KL-based regularization improve both convergence and accuracy in federated CP settings, supporting scalable, privacy-preserving collaborative perception in intelligent transportation systems.

Abstract

Cooperative perception (CP) is essential to enhance the efficiency and safety of future transportation systems, but it requires extensive data sharing among vehicles on the road, which raises significant privacy concerns. Federated learning offers a promising solution by enabling privacy-preserving collaborative enhancements in perception, decision-making, and planning among connected and autonomous vehicles (CAVs). However, federated learning is impeded by significant challenges arising from data heterogeneity across diverse clients, which can diminish model accuracy and prolong convergence. This study introduces a specialized federated learning framework for CP, termed the federated dynamic weighted aggregation (FedDWA) algorithm, facilitated by a dynamic adjusting loss (DALoss) function. This framework employs dynamic client weighting to direct model convergence and integrates a novel loss function that utilizes Kullback-Leibler divergence (KLD) to counteract the detrimental effects of non-independent and identically distributed (Non-IID) and unbalanced data. Utilizing the BEV transformer as the primary model, our rigorous testing on the OpenV2V dataset, augmented with FedBEVT data, demonstrates significant improvements in the average intersection over union (IoU). These results highlight the substantial potential of our federated learning framework to address data heterogeneity challenges in CP, thereby enhancing the accuracy of environmental perception models and facilitating more robust and efficient collaborative learning solutions in the transportation sector.
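
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the paper's actual formulation. It assumes that the KL-based term is added to a local task loss with a hypothetical coefficient `lam`, and that aggregation weights come from a hypothetical softmax over negative divergences, so that clients closer to the global model count more. The function names, the weighting rule, and `lam` are all illustrative assumptions; the abstract does not specify them.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def regularized_local_loss(task_loss, local_probs, global_probs, lam=0.1):
    """Assumed form of a KL-regularized loss: task loss plus lam * KL(local || global).
    The coefficient lam is a hypothetical hyperparameter."""
    return task_loss + lam * kl_divergence(local_probs, global_probs)

def dynamic_weights(divergences):
    """Hypothetical weighting rule: softmax of negative divergence,
    so a smaller divergence yields a larger aggregation weight."""
    scores = [math.exp(-d) for d in divergences]
    total = sum(scores)
    return [s / total for s in scores]

def aggregate(client_params, weights):
    """Weighted average of client parameter vectors (flat lists of floats)."""
    n = len(client_params[0])
    return [sum(w * params[i] for w, params in zip(weights, client_params))
            for i in range(n)]

# Toy usage: two clients, the second one drifting further from the global model.
global_probs = [0.5, 0.5]
client_probs = [[0.55, 0.45], [0.9, 0.1]]
divs = [kl_divergence(p, global_probs) for p in client_probs]
weights = dynamic_weights(divs)
new_global = aggregate([[1.0, 2.0], [3.0, 4.0]], weights)
```

In this toy setup the first client receives the larger weight because its output distribution is closer to the global one. Whether FedDWA rewards or penalizes divergence in this way is not stated in the abstract, so the direction of the rule should be read as an assumption.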

Paper Structure

This paper contains 14 sections, 12 equations, 2 figures, 2 tables, and 1 algorithm.

Figures (2)

  • Figure 1: This figure illustrates the results for four distinct clients. The x-axis represents the number of communication rounds, while the y-axis indicates the IoU values. Subfigure (a) displays the performance for the bus client, subfigure (b) shows the results for the truck client, and subfigures (c) and (d) present the outcomes for car client A and car client B, respectively.
  • Figure 2: This figure visualizes the model output across different clients. The ego vehicle is at the center. Each panel presents comparative results for a different vehicle client: (a) Bus, (b) Truck, and (c) Car. For each subfigure, the data is organized into six columns. The first column displays camera imagery from the frontal perspective, while the second column shows the rear perspective. The third column represents the ground truth. The fourth column illustrates the model output from FedBEVT [song2023fedbevt]. The fifth column shows the result from FedDWA without DALoss. The final column depicts the result from the model trained with both FedDWA and DALoss, showcasing the effectiveness of the proposed method.
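
The IoU metric plotted in Figure 1 can be illustrated with a short sketch. It assumes binary BEV occupancy grids stored as nested lists of 0/1 values, and it averages per-client IoU with equal weight; the grid representation, the helper names, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the paper.

```python
def iou(pred, target):
    """Intersection over union of two binary grids of equal shape."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0

def average_iou(per_client_ious):
    """Unweighted mean IoU over clients (e.g. bus, truck, car A, car B)."""
    return sum(per_client_ious) / len(per_client_ious)

# Toy usage: the prediction marks two cells, the ground truth marks one of them.
pred = [[1, 1],
        [0, 0]]
target = [[1, 0],
          [0, 0]]
score = iou(pred, target)  # 1 overlapping cell / 2 cells in the union = 0.5
```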