Unsupervised Multi-Source Federated Domain Adaptation under Domain Diversity through Group-Wise Discrepancy Minimization

Larissa Reichart, Cem Ata Baykara, Ali Burak Ünal, Harlin Lee, Mete Akgün

TL;DR

This work tackles unsupervised multi-source domain adaptation under privacy constraints, focusing on scalability to many heterogeneous sources. It introduces GALA, a federated UMDA framework that combines inter-group discrepancy minimization (IGD) with a temperature-scaled centroid-based weighting (MDMGB+) to efficiently align diverse sources to an unlabeled target. The approach yields strong, stable performance on standard benchmarks and remains robust as source diversity grows, demonstrated on the new Digit-18 dataset where gains over prior methods are substantial. The contributions offer a practical pathway for scalable federated domain adaptation in real-world, privacy-sensitive environments.

Abstract

Unsupervised multi-source domain adaptation (UMDA) aims to learn models that generalize to an unlabeled target domain by leveraging labeled data from multiple, diverse source domains. While distributed UMDA methods address privacy constraints by avoiding raw data sharing, existing approaches typically assume a small number of sources and fail to scale effectively. Increasing the number of heterogeneous domains often makes existing methods impractical, leading to high computational overhead or unstable performance. We propose GALA, a scalable and robust federated UMDA framework that introduces two key components: (1) a novel inter-group discrepancy minimization objective that efficiently approximates full pairwise domain alignment without quadratic computation; and (2) a temperature-controlled, centroid-based weighting strategy that dynamically prioritizes source domains based on alignment with the target. Together, these components enable stable and parallelizable training across large numbers of heterogeneous sources. To evaluate performance in high-diversity scenarios, we introduce Digit-18, a new benchmark comprising 18 digit datasets with varied synthetic and real-world domain shifts. Extensive experiments show that GALA consistently achieves competitive or state-of-the-art results on standard benchmarks and significantly outperforms prior methods in diverse multi-source settings where others fail to converge.
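
As an illustration of the second component, the following is a minimal Python sketch of a temperature-controlled, centroid-based weighting. It is not GALA's actual weighting rule, which the abstract does not specify. The sketch assumes that each domain is summarized by the mean of its feature vectors, that alignment is measured by Euclidean distance between a source centroid and the target centroid, and that weights come from a softmax over negative distances. The names `centroid`, `source_weights`, and `tau` are hypothetical.

```python
import math


def centroid(features):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def source_weights(source_centroids, target_centroid, tau=1.0):
    """Softmax over negative centroid distances, scaled by temperature tau.

    Assumption for illustration only: sources whose centroid lies closer
    to the target centroid receive a larger weight.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    # Euclidean distance of each source centroid to the target centroid.
    distances = [math.dist(c, target_centroid) for c in source_centroids]
    logits = [-d / tau for d in distances]
    # Subtract the maximum logit for numerical stability.
    top = max(logits)
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    target = centroid([[0.0, 0.0], [0.2, -0.2]])
    sources = [[0.1, 0.0], [3.0, 4.0], [1.0, 1.0]]
    for tau in (0.1, 1.0, 10.0):
        print(tau, source_weights(sources, target, tau=tau))
```

Under these assumptions, a small `tau` concentrates nearly all weight on the best-aligned source, while a large `tau` moves the weights toward uniform.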

Paper Structure

This paper contains 42 sections, 1 theorem, 9 equations, 10 figures, 7 tables, 1 algorithm.

Key Result

Theorem 1

Let $\mathcal{H}$ be the model space, and let $w_1,\dots,w_N\in\mathbb{R}_+$ satisfy $\sum_{n=1}^N w_n=1$. Then for any $h\in\mathcal{H}$, where $\lambda_0$ is a constant for the task error of the optimal model.
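
The statement above has the shape of the standard weighted multi-source generalization bound. As a hedged sketch only, using notation that is assumed rather than taken from this page ($\epsilon_{T}$ and $\epsilon_{S_n}$ for the target and $n$-th source task errors, $\mathcal{D}_{T}$ and $\mathcal{D}_{S_n}$ for the corresponding distributions, and $d_{\mathcal{H}\Delta\mathcal{H}}$ for the $\mathcal{H}\Delta\mathcal{H}$-divergence), bounds of this type are commonly written as:

$$
\epsilon_{T}(h) \;\le\; \sum_{n=1}^{N} w_n \left( \epsilon_{S_n}(h) + \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}\!\left(\mathcal{D}_{S_n}, \mathcal{D}_{T}\right) \right) + \lambda_0
$$

The paper's exact inequality may differ in notation or in additional terms.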

Figures (10)

  • Figure 1: Performance across Digit-Five targets for increasing source domains. KD3A is excluded beyond 9 sources due to exponential runtime.
  • Figure 2: Test accuracy over training rounds for four Digit-Five target domains in the full Digit-18 setup.
  • Figure 3: Test accuracy over training rounds for four Digit-18 target domains in the full Digit-18 setup.
  • Figure 4: Effect of $\tau$ in GALA on adaptation performance for SVHNXS and MNIST-M (Digit-18).
  • Figure 5: Effect of $\tau$ in GALA on adaptation performance for SVHN and MNIST-M (Digit-Five).
  • ...and 5 more figures

Theorems & Definitions (1)

  • Theorem 1: NEURIPS2018_717d8b3d