PARDON: Privacy-Aware and Robust Federated Domain Generalization

Dung Thuy Nguyen, Taylor T. Johnson, Kevin Leach

TL;DR

The paper tackles domain shift in federated learning, targeting scenarios where clients hold data from multiple domains, the global model must generalize to unseen domains, and direct cross-client information sharing risks privacy. It introduces PARDON, a privacy-aware FedDG framework that learns a global interpolative style through two-level FINCH clustering of local styles, then uses AdaIN-based style transfer combined with a multi-domain contrastive loss and embedding regularization to align local models with a global representation. The global aggregation follows the standard FedAvg formulation $G = \frac{1}{N}\sum_{i=1}^K n_i G_i$, where $n_i$ is the local data size of client $i$ and $N=\sum_{i=1}^K n_i$, while the interpolative style is computed as $S_g = \mathrm{median}(S(\varepsilon_j)\;|\; \varepsilon_j \in \Gamma_L)$. Empirically, PARDON outperforms state-of-the-art FedDG baselines on PACS, Office-Home, and IWildCam, with unseen-domain improvements ranging from $3.64\%$ to $57.22\%$. It is also robust to domain heterogeneity and client sampling, with low computational overhead and stronger privacy than cross-client style sharing.
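
The two formulas above can be illustrated with a minimal, standard-library-only sketch. It assumes model weights and style statistics are flat lists of floats and that the median is taken element-wise over the style vectors of the final clusters $\Gamma_L$; the function names `fedavg` and `interpolative_style` are hypothetical, and the FINCH clustering that produces the clusters is not reproduced here.

```python
from statistics import median


def fedavg(local_weights, local_sizes):
    """Size-weighted average G = (1/N) * sum_i n_i * G_i over flat weight vectors."""
    total = sum(local_sizes)  # N = sum_i n_i
    dim = len(local_weights[0])
    return [
        sum(n * w[d] for w, n in zip(local_weights, local_sizes)) / total
        for d in range(dim)
    ]


def interpolative_style(cluster_styles):
    """Element-wise median over cluster style vectors (assumed reading of S_g)."""
    dim = len(cluster_styles[0])
    return [median(s[d] for s in cluster_styles) for d in range(dim)]


# Toy inputs: two clients with 1 and 3 samples, and three cluster styles.
local_weights = [[1.0, 2.0], [3.0, 4.0]]
local_sizes = [1, 3]
global_weights = fedavg(local_weights, local_sizes)

cluster_styles = [[0.1, 1.0], [0.5, 2.0], [0.3, 9.0]]
global_style = interpolative_style(cluster_styles)

print(global_weights)  # [2.5, 3.5]
print(global_style)    # [0.3, 2.0]
```

In the toy run, the outlying value 9.0 in the second style dimension does not pull the result away from 2.0, which is the usual motivation for a median over a mean when one cluster is atypical.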

Abstract

Federated Learning (FL) shows promise in preserving privacy and enabling collaborative learning. However, most current solutions focus on private data collected from a single domain. A significant challenge arises when client data comes from diverse domains (i.e., domain shift), leading to poor performance on unseen domains. Existing Federated Domain Generalization (FedDG) approaches address this problem but assume each client holds data for an entire domain, limiting their practicality in real-world scenarios with domain-based heterogeneity and client sampling. In addition, certain methods enable information sharing among clients, raising privacy concerns as this information could be used to reconstruct sensitive private data. To overcome this, we introduce FISC, a novel FedDG paradigm designed to robustly handle more complicated domain distributions between clients while ensuring security. FISC enables learning across domains by extracting an interpolative style from local styles and employing contrastive learning. This strategy gives clients multi-domain representations and unbiased convergent targets. Empirical results on multiple datasets, including PACS, Office-Home, and IWildCam, show FISC outperforms state-of-the-art (SOTA) methods, with accuracy improvements on unseen domains ranging from 3.64% to 57.22%. Our code is available at https://github.com/judydnguyen/PARDON-FedDG.

Paper Structure

This paper contains 13 sections, 9 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: Loss landscape visualization of two clients under domain-based heterogeneity, with normal training (first row) and our method using interpolative style-transferred data (second row). The vertical axis shows the loss (i.e., solution for the global optimum $G$ and each local objective $G_i$). The horizontal plane represents a parameter space centered at the global model weight. The feature visualization in the last column demonstrates that our method achieves superior classification results on unseen domains.
  • Figure 2: The PARDON framework we present in this paper. Clients calculate their style information using their local data and a pre-trained encoder in Step ①. The server extracts interpolation style by weighted averaging the local style information in Step ②. Then, in Step ③, local clients exploit the interpolation style to obtain style-transferred data and update their models via contrastive learning. The aggregated global model in Step ④ runs inference on unseen data.
  • Figure 3: Convergence curves over training rounds, evaluated on PACS's Sketch domain with Art-Painting and Cartoon as the training domains; domain heterogeneity decreases from left to right. $\lambda = 0$ means domain separation while $\lambda = 1.0$ means homogeneous domain distribution.
  • Figure 4: Computational overhead of FedDG methods.
  • Figure 5: Accuracy comparison with different settings of selected clients (K) per total number of clients (N).
  • ...and 3 more figures
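
Steps ① and ③ in the Figure 2 caption rely on style statistics and AdaIN-based style transfer. The sketch below shows only the standard AdaIN re-normalization, $\sigma_t \cdot (x - \mu_x)/\sigma_x + \mu_t$, applied per channel to features given as plain lists. It assumes a style is a list of per-channel (mean, std) pairs; the pre-trained encoder and the decoder that maps features back to images are omitted, and the names `channel_style` and `adain` are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def channel_style(channel, eps=1e-5):
    """Return (mean, std) of one feature channel given as a flat list of activations."""
    return mean(channel), sqrt(pvariance(channel) + eps)


def adain(content_channels, target_style):
    """Shift each content channel to the target (mean, std) of the same channel."""
    out = []
    for channel, (t_mean, t_std) in zip(content_channels, target_style):
        c_mean, c_std = channel_style(channel)
        out.append([t_std * (x - c_mean) / c_std + t_mean for x in channel])
    return out


# Toy example: one channel of local features moved to an assumed global style.
content = [[1.0, 2.0, 3.0, 4.0]]
interp_style = [(10.0, 2.0)]  # hypothetical (mean, std) received from the server
stylized = adain(content, interp_style)

new_mean, new_std = channel_style(stylized[0])
print(round(new_mean, 3), round(new_std, 3))
```

After the transfer, the channel statistics match the target style up to the small `eps` term, while the relative ordering of activations within the channel is unchanged.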

Theorems & Definitions (4)

  • Definition 1: Domain
  • Definition 2: Domain Generalization
  • Definition 3: Domain Shift in FL
  • Definition 4: Domain-based Client Heterogeneity