Overcoming Data and Model Heterogeneities in Decentralized Federated Learning via Synthetic Anchors

Chun-Yin Huang, Kartik Srinivas, Xin Zhang, Xiaoxiao Li

TL;DR

This paper tackles the challenge of data and model heterogeneity in decentralized federated learning by introducing DeSA, a serverless method that relies on synthetic anchors to enable mutual learning across heterogeneous clients. The approach comprises local data synthesis, synthetic anchor generation via distribution matching, and two regularization mechanisms: REG for latent space alignment and KD for cross-client knowledge transfer, all conducted without public data. The authors provide theoretical generalization bounds showing how synthetic anchors can reduce domain discrepancy and improve cross-domain performance, along with practical algorithms like FastMix for decentralized aggregation. Empirically, DeSA delivers improved inter- and intra-client accuracy across multiple domain-shift benchmarks (DIGITS, OFFICE, CIFAR10C) under both heterogeneous and homogeneous model settings, and demonstrates robustness to ablations and privacy considerations.
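
As a rough illustration of the "synthetic anchor generation via distribution matching" step, the sketch below matches the mean feature of a small synthetic set to the mean feature of real data for one class. It is a toy under stated assumptions, not the authors' procedure: the function names (`mean_vector`, `matching_loss`, `matching_step`) are hypothetical, the features are optimized directly (an identity embedding) rather than through a feature extractor, and the squared distance between means is only one simple choice of matching criterion.

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def matching_loss(real_feats, syn_feats):
    """Squared distance between the mean real and mean synthetic feature."""
    mu_real = mean_vector(real_feats)
    mu_syn = mean_vector(syn_feats)
    return sum((r - s) ** 2 for r, s in zip(mu_real, mu_syn))


def matching_step(real_feats, syn_feats, lr=0.5):
    """One gradient-descent step on the synthetic features (assumed setup)."""
    mu_real = mean_vector(real_feats)
    mu_syn = mean_vector(syn_feats)
    m = len(syn_feats)
    # d(loss)/d(syn_j) = -(2/m) * (mu_real - mu_syn) for every synthetic row j,
    # so descending the gradient shifts each row toward the real mean.
    shift = [lr * (2.0 / m) * (r - s) for r, s in zip(mu_real, mu_syn)]
    return [[x + d for x, d in zip(row, shift)] for row in syn_feats]


# Toy data for a single class (hypothetical values).
real = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
syn = [[0.0, 0.0], [1.0, 1.0]]

before = matching_loss(real, syn)
syn = matching_step(real, syn)
after = matching_loss(real, syn)
print(before, after)  # the loss shrinks after the step
```

In a realistic setting the synthetic images themselves would be the optimization variables and the loss would be computed per class on embeddings from a network; only the matching idea carries over from this toy.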

Abstract

Conventional Federated Learning (FL) involves collaborative training of a global model while maintaining user data privacy. One of its branches, decentralized FL, is a serverless network that allows clients to own and optimize different local models separately, which results in saving management and communication resources. Despite the promising advancements in decentralized FL, it may reduce model generalizability due to lacking a global model. In this scenario, managing data and model heterogeneity among clients becomes a crucial problem, which poses a unique challenge that must be overcome: How can every client's local model learn generalizable representation in a decentralized manner? To address this challenge, we propose a novel Decentralized FL technique by introducing Synthetic Anchors, dubbed as DeSA. Based on the theory of domain adaptation and Knowledge Distillation (KD), we theoretically and empirically show that synthesizing global anchors based on raw data distribution facilitates mutual knowledge transfer. We further design two effective regularization terms for local training: 1) REG loss that regularizes the distribution of the client's latent embedding with the anchors and 2) KD loss that enables clients to learn from others. Through extensive experiments on diverse client data distributions, we showcase the effectiveness of DeSA in enhancing both inter- and intra-domain accuracy of each client.
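
To make the two regularization terms concrete, here is a minimal sketch of how a client's local objective could combine a supervised loss with a REG term and a KD term. The exact formulations are not given in this summary, so everything below is an assumption: `reg_loss` uses a plain squared distance between a raw embedding and a same-class anchor embedding as a stand-in, `kd_loss` uses the KL divergence from the averaged softened predictions of peer models on an anchor, and the weights `lam_reg` and `lam_kd` are hypothetical names for the $\lambda$ coefficients.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Standard cross-entropy for one labeled example."""
    return -math.log(softmax(logits)[label])


def reg_loss(raw_embedding, anchor_embedding):
    """Stand-in REG term: squared distance to a same-class anchor embedding."""
    return sum((a - b) ** 2 for a, b in zip(raw_embedding, anchor_embedding))


def kd_loss(student_logits, peer_logits_list, temperature=2.0):
    """Stand-in KD term: KL(teacher || student) on one anchor, where the
    teacher is the average of the peers' softened predictions."""
    student = softmax(student_logits, temperature)
    peers = [softmax(l, temperature) for l in peer_logits_list]
    teacher = [sum(p[k] for p in peers) / len(peers) for k in range(len(student))]
    return sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)


def local_objective(raw_logits, label, raw_emb, anchor_emb,
                    anchor_logits, peer_anchor_logits,
                    lam_reg=1.0, lam_kd=1.0):
    """Weighted sum of the supervised, REG, and KD terms (assumed form)."""
    ce = cross_entropy(raw_logits, label)
    reg = reg_loss(raw_emb, anchor_emb)
    kd = kd_loss(anchor_logits, peer_anchor_logits)
    return ce + lam_reg * reg + lam_kd * kd


# Toy values (hypothetical): one raw sample and one anchor with two peers.
own = [2.0, 0.5, -1.0]
peers = [[1.5, 0.7, -0.5], [0.2, 1.9, -0.3]]
total = local_objective(own, 0, [0.1, 0.4], [0.3, 0.1], own, peers)
print(total)
```

Setting `lam_reg` and `lam_kd` to zero recovers plain local training, which is one way to read an ablation over the $\lambda$ values.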

Paper Structure (42 sections, 4 theorems, 24 equations, 12 figures, 10 tables, 1 algorithm)

Key Result

Theorem 1

Denote client $C_i$'s model as $M_i = \rho_i \circ \psi_i \in {\mathcal{\boldsymbol{P}}}_i \circ \Psi_i = \mathcal{M}_i$ and its overall source distribution as $P_i^S$ with component weights $\boldsymbol{\alpha}$. Then the generalization error on the global data distribution $P^T$ can be bounded, where $\mathbf{C}(P_i, P_T)$ are small distance terms depending on the distributions $P_i$ and $P_T$.

Figures (12)

  • Figure 1: The decision boundary before (a) and after (b) applying our proposed $\mathcal{L}_{\rm REG}$ (Eq. \ref{eq:lsgd}) and $\mathcal{L}_{\rm KD}$ (Eq. \ref{eq:kd}) using our synthetic anchor data. $\mathcal{L}_{\rm REG}$ aims to group the raw features towards the synthetic anchor features, and $\mathcal{L}_{\rm KD}$ twists the local decision boundary towards the generalized decision boundary.
  • Figure 2: Heterogeneous setup and DeSA pipeline. (a) We assume a realistic FL scenario, where clients have different data distributions and computational powers, which results in different model architectures. (b) The DeSA pipeline consists of three phases: local data synthesis (top left), global synthetic data aggregation (top right) (Section \ref{sec:data_generation}), and decentralized training (bottom) using anchor regularization (Section \ref{sec:reg_loss}) and knowledge distillation (Section \ref{sec:kd_loss}).
  • Figure 3: Ablation studies for $\lambda$'s using OFFICE. We report the averaged global accuracy when changing the $\lambda$ values.
  • Figure 4: Membership inference attack (MIA) on the models trained by the SVHN, SynthDigits, and MNIST-M clients. Observe that the synthetic data sharing of DeSA does not reveal other clients' local data identity information.
  • Figure 5: Visualization of the global and local synthetic images from the DIGITS dataset: (a) MNIST client; (b) SVHN client; (c) USPS client; (d) SynthDigits client; (e) MNIST-M client; (f) server synthetic data.
  • ...and 7 more figures

Theorems & Definitions (12)

  • Definition 4.1
  • Definition 4.2
  • Theorem 1
  • Remark 4.3
  • Proposition 2
  • proof
  • proof
  • Lemma 1
  • proof
  • Lemma 2: Appendix A feng2021kd3a
  • ...and 2 more