Federated Source-free Domain Adaptation for Classification: Weighted Cluster Aggregation for Unlabeled Data
Junki Mori, Kosuke Kihara, Taiki Miyagawa, Akinori F. Ebihara, Isamu Teranishi, Hisashi Kashima
TL;DR
This work tackles Federated source-Free Domain Adaptation (FFREEDA) for classification, where a server holds a source-trained model and clients possess unlabeled data from diverse target domains. It introduces FedWCA, a three-phase method: (1) private, parameter-free clustering of clients with FINCH, using only first-layer parameters, to form domain-based clusters; (2) Weighted Cluster Aggregation, which blends cluster models using client-specific weights computed from prototypes and Soft Neighborhood Density; and (3) local adaptation with a two-phase pseudo-labeling strategy that incorporates prototypes and mixup to exploit unlabeled data. Empirically, FedWCA outperforms baselines including LADD and FedPCL+PL on Digit-Five, PACS, and Office-Home, and ablations confirm the value of cluster weighting, the choice of clustering layers, and the pseudo-labeling design. The method advances practical FFREEDA by reducing privacy risks, minimizing the number of hyperparameters, and enabling effective cross-domain knowledge transfer for personalized classification in non-i.i.d. federated settings.
Abstract
Federated learning (FL) commonly assumes that the server or some clients have labeled data, which is often impractical due to annotation costs and privacy concerns. To address this problem, we focus on a source-free domain adaptation task, where (1) the server holds a model pre-trained on labeled source-domain data, (2) clients possess only unlabeled data from various target domains, and (3) neither the server nor the clients can access the source data during the adaptation phase. This task is known as Federated source-Free Domain Adaptation (FFREEDA). Specifically, we focus on classification tasks, whereas previous work has studied only semantic segmentation. Our contribution is the novel Federated learning with Weighted Cluster Aggregation (FedWCA) method, designed to mitigate both domain shifts and privacy concerns using only unlabeled data. FedWCA comprises three phases: private and parameter-free clustering of clients to obtain domain-specific global models on the server, weighted aggregation of the global models for the clustered clients, and local domain adaptation with pseudo-labeling. Experimental results show that FedWCA surpasses several existing methods and baselines in FFREEDA, demonstrating its effectiveness and practicality.
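
The first two phases can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes that each model is a flat dictionary of parameter lists, that cluster assignments are already known (they are hard-coded here instead of being produced by a clustering step), and that the client-specific weights are given as plain numbers instead of being computed from the client's data. The names `average_models` and `weighted_cluster_aggregation` are hypothetical.

```python
from typing import Dict, List

# Assumption: a model is a dict mapping a parameter name to a flat list of floats.
Params = Dict[str, List[float]]


def average_models(models: List[Params]) -> Params:
    """Uniformly average the models of one cluster into a cluster model."""
    n = len(models)
    return {
        key: [sum(vals) / n for vals in zip(*(m[key] for m in models))]
        for key in models[0]
    }


def weighted_cluster_aggregation(cluster_models: List[Params],
                                 weights: List[float]) -> Params:
    """Blend cluster models using one client's weights (normalized to sum to 1)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    norm = [w / total for w in weights]
    return {
        key: [sum(w * v for w, v in zip(norm, vals))
              for vals in zip(*(m[key] for m in cluster_models))]
        for key in cluster_models[0]
    }


# Toy example: three clients, two clusters (assignments are assumed, not learned).
client_a = {"w": [1.0, 2.0]}
client_b = {"w": [3.0, 4.0]}
client_c = {"w": [10.0, 20.0]}

cluster_models = [
    average_models([client_a, client_b]),  # cluster 0
    average_models([client_c]),            # cluster 1
]

# Hypothetical weights for client_a: it leans mostly on its own cluster.
init_for_a = weighted_cluster_aggregation(cluster_models, weights=[3.0, 1.0])
```

In this sketch, `init_for_a` would serve as the starting point for client A's local adaptation; a weight vector concentrated on a single cluster reduces to using that cluster's model alone.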
