Fast-Convergent and Communication-Alleviated Heterogeneous Hierarchical Federated Learning in Autonomous Driving
Wei-Bin Kou, Qingfeng Lin, Ming Tang, Rongguang Ye, Shuai Wang, Guangxu Zhu, Yik-Chung Wu
TL;DR
The paper addresses inter-city domain shift in Street Scene Semantic Understanding (TriSU) for autonomous driving and the slow convergence of Hierarchical Federated Learning (HFL) under non-i.i.d. city data. It introduces FedGau, a Gaussian-distribution-based aggregation weighting scheme, and AdapRS, a performance-aware adaptive resource scheduling policy, to accelerate convergence and reduce communication. FedGau models each RGB image and each city's dataset as a Gaussian distribution and uses the Bhattacharyya distance $D_B$ to quantify distributional similarity for weighting, accelerating convergence by 35.5%-40.6% over state-of-the-art HFL methods and improving accuracy; AdapRS dynamically tunes edge-cloud communication intervals, cutting communication overhead by 29.65% while maintaining almost the same performance. The framework demonstrates improved generalization and efficiency on Cityscapes and CamVid across multiple backbones, with potential applicability to broader privacy-preserving distributed learning tasks in autonomous driving.
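For reference, the Bhattacharyya distance has a standard closed form when both distributions are univariate Gaussians $p = \mathcal{N}(\mu_p, \sigma_p^2)$ and $q = \mathcal{N}(\mu_q, \sigma_q^2)$. The symbols $\mu_p, \sigma_p, \mu_q, \sigma_q$ are notation chosen here for illustration, and this summary does not state whether the paper uses the univariate or a multivariate form:

$$
D_B(p, q) = \frac{1}{4}\,\frac{(\mu_p - \mu_q)^2}{\sigma_p^2 + \sigma_q^2} + \frac{1}{2}\ln\!\left(\frac{\sigma_p^2 + \sigma_q^2}{2\,\sigma_p\,\sigma_q}\right)
$$

The distance is zero when the two Gaussians coincide and grows as their means or variances diverge.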
Abstract
Street Scene Semantic Understanding (denoted as TriSU) is a complex task for autonomous driving (AD). However, an inference model trained on data from one geographical region generalizes poorly to other regions because of inter-city data domain shift. Hierarchical Federated Learning (HFL) offers a potential solution for improving TriSU model generalization through collaborative, privacy-preserving training over distributed datasets from different cities. Unfortunately, it suffers from slow convergence because data from different cities have disparate statistical properties. Going beyond existing HFL methods, we propose a Gaussian heterogeneous HFL algorithm (FedGau) that addresses inter-city data heterogeneity so that convergence can be accelerated. In the proposed FedGau algorithm, both a single RGB image and an RGB dataset are modelled as Gaussian distributions for aggregation weight design. This approach not only differentiates each RGB image by its own statistical distribution, but also exploits the statistics of each city's dataset in addition to the conventionally considered data volume. With the proposed approach, convergence is accelerated by 35.5\%-40.6\% compared to existing state-of-the-art (SOTA) HFL methods. In addition, to reduce the communication resources involved, we introduce a novel performance-aware adaptive resource scheduling (AdapRS) policy. Unlike the traditional static resource scheduling policy, which exchanges a fixed number of models between two adjacent aggregations, AdapRS adjusts the number of model aggregations at different levels of HFL so that unnecessary communications are minimized. Extensive experiments demonstrate that AdapRS saves 29.65\% of communication overhead compared to the conventional static resource scheduling policy while maintaining almost the same performance.
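The idea of using dataset statistics for aggregation weights can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes each dataset is summarized by a univariate Gaussian over pixel intensities, and it uses a hypothetical weighting rule in which each weight is proportional to $\exp(-D_B)$, the Bhattacharyya coefficient between the local and global Gaussians. The function names `gaussian_stats`, `bhattacharyya_distance`, and `similarity_weights` are invented here, and the abstract does not specify the actual FedGau weight formula.

```python
import math


def gaussian_stats(values):
    """Fit a univariate Gaussian (mean, variance) to a flat list of intensities."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


def bhattacharyya_distance(mu1, var1, mu2, var2, eps=1e-12):
    """Closed-form Bhattacharyya distance between two univariate Gaussians."""
    var1, var2 = var1 + eps, var2 + eps  # guard against zero variance
    s = var1 + var2
    mean_term = 0.25 * (mu1 - mu2) ** 2 / s
    var_term = 0.5 * math.log(s / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


def similarity_weights(local_stats, global_stats):
    """Hypothetical rule (an assumption, not the paper's): weight_i is
    proportional to exp(-D_B(local_i, global)), normalized to sum to 1."""
    g_mu, g_var = global_stats
    scores = [
        math.exp(-bhattacharyya_distance(mu, var, g_mu, g_var))
        for mu, var in local_stats
    ]
    total = sum(scores)
    return [s / total for s in scores]


# Toy usage with made-up intensities (not data from the paper).
city_pixels = {
    "city_a": [0.20, 0.25, 0.30, 0.35],
    "city_b": [0.60, 0.70, 0.80, 0.90],
}
all_pixels = [v for px in city_pixels.values() for v in px]
local = [gaussian_stats(px) for px in city_pixels.values()]
weights = similarity_weights(local, gaussian_stats(all_pixels))
print(dict(zip(city_pixels, weights)))
```

In this sketch a city whose intensity distribution is closer to the pooled distribution receives a larger weight. A real system would also need to account for data volume, which the abstract names as the conventionally considered factor.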
