Multi-Source to Multi-Target Decentralized Federated Domain Adaptation

Su Wang, Seyyedali Hosseinalipour, Christopher G. Brinton

TL;DR

This paper develops a decentralized federated domain adaptation methodology that transfers ML models from devices with high-quality labeled data to devices with low-quality or unlabeled data, and solves the resulting optimization problem tractably with an algorithm based on successive convex approximations.

Abstract

Heterogeneity across devices in federated learning (FL) typically refers to statistical (e.g., non-i.i.d. data distributions) and resource (e.g., communication bandwidth) dimensions. In this paper, we focus on another important dimension that has received less attention: varying quantities/distributions of labeled and unlabeled data across devices. In order to leverage all data, we develop a decentralized federated domain adaptation methodology which considers the transfer of ML models from devices with high quality labeled data (called sources) to devices with low quality or unlabeled data (called targets). Our methodology, Source-Target Determination and Link Formation (ST-LF), optimizes both (i) classification of devices into sources and targets and (ii) source-target link formation, in a manner that considers the trade-off between ML model accuracy and communication energy efficiency. To obtain a concrete objective function, we derive a measurable generalization error bound that accounts for estimates of source-target hypothesis deviations and divergences between data distributions. The resulting optimization problem is a mixed-integer signomial program, a class of NP-hard problems, for which we develop an algorithm based on successive convex approximations to solve it tractably. Subsequent numerical evaluations of ST-LF demonstrate that it improves classification accuracy and energy efficiency over state-of-the-art baselines.

Paper Structure (24 sections, 10 theorems, 55 equations, 9 figures, 2 tables, 2 algorithms)

Key Result

Theorem 1

Let $\mathcal{H}$ be a hypothesis space of Vapnik–Chervonenkis (VC) dimension $d$, $s$ be a source domain, $t$ be a target domain, and $\widehat{\mathcal{D}}^{\mathsf{S}}_s$, $\widehat{\mathcal{D}}^{\mathsf{T}}_t$ be the empirical distributions induced by samples of size $n$ drawn from $\mathcal{D}^

Figures (9)

  • Figure 1: Small motivating example of a network of 5 smart cars. Only cars B, C, and D have meaningful amounts of labeled data, with cars A and E containing very few or no labeled data. Using a server, FL combines ML models from devices with labeled data, yielding a global model heavily biased for the "blue" domain. Meanwhile, ST-LF uses unlabeled data to estimate pair-wise divergences, then determines source/target selection and source-to-target link formation, leading to individualized ML models without a server.
  • Figure 2: Overview of our ST-LF methodology, where each color represents a different domain. On the left, ST-LF first determines empirical distribution divergences among device pairs through comparison of a binary domain hypothesis/classifier (visualized for device 1). In the middle, ST-LF extracts information about the network environment, such as communication energy costs indicated by "Tx Energy Use". Finally, on the right, ST-LF uses these measurements in an optimization problem to determine optimal source/target classification $\boldsymbol{\psi}$ and combination weights $\boldsymbol{\alpha}$ with respect to both expected ML model performance and network energy consumption.
  • Figure 3: An overview of the three common domain adaptation datasets used to evaluate our method. We explain the physical differences of each dataset in Sec. \ref{sec:experiments}.
  • Figure 4: Convergence behavior and source/target device classification at convergence for Algorithm \ref{alg:optimization_iteration} with two different settings of source errors across devices with labeled/partially labeled data.
  • Figure 5: The effects of uniform, extreme, and random distribution divergence regimes on the behavior of $({\boldsymbol{\mathcal{P}}})$. Each regime occupies a column, showing (i) the divergences $d_{\mathcal{H}\Delta\mathcal{H}}$ between pairs of devices, (ii) the optimized source/target classifications $\boldsymbol{\psi}$, and (iii) the optimized combination weights $\boldsymbol{\alpha}$. The third row breaks down received ML models at targets from source devices 1-5 (i.e., $D:1,\cdots,5$), which are proportional to the divergences.
  • ...and 4 more figures
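
The Figure 2 caption describes estimating pairwise distribution divergences with a binary domain classifier. A minimal sketch of that general idea follows, using the widely used proxy estimate $\hat{d} = 2(1 - 2\epsilon)$, where $\epsilon$ is the held-out error of a classifier trained to tell the two devices' samples apart. This is an illustration under stated assumptions, not the paper's estimator: the function name `proxy_divergence`, the logistic-regression classifier, the 50/50 train/test split, and the hyperparameters are all hypothetical choices.

```python
import numpy as np


def proxy_divergence(x_a, x_b, epochs=200, lr=0.1, seed=0):
    """Estimate a divergence proxy between two unlabeled sample sets.

    Trains a logistic-regression domain classifier (device A = 0,
    device B = 1) and returns 2 * (1 - 2 * err), clipped to [0, 2],
    where err is the classifier's error on held-out samples.
    """
    rng = np.random.default_rng(seed)
    x = np.vstack([x_a, x_b])
    y = np.concatenate([np.zeros(len(x_a)), np.ones(len(x_b))])

    # Shuffle, then hold out half of the pooled samples for evaluation.
    idx = rng.permutation(len(x))
    x, y = x[idx], y[idx]
    half = len(x) // 2
    x_tr, y_tr, x_te, y_te = x[:half], y[:half], x[half:], y[half:]

    # Standardize features using training statistics only.
    mu, sd = x_tr.mean(axis=0), x_tr.std(axis=0) + 1e-8
    x_tr, x_te = (x_tr - mu) / sd, (x_te - mu) / sd

    # Plain batch gradient descent on the logistic loss.
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x_tr @ w + b)))
        grad = p - y_tr
        w -= lr * (x_tr.T @ grad) / len(y_tr)
        b -= lr * grad.mean()

    # Held-out domain-classification error drives the estimate.
    pred = (x_te @ w + b > 0).astype(float)
    err = float(np.mean(pred != y_te))
    return float(np.clip(2.0 * (1.0 - 2.0 * err), 0.0, 2.0))
```

A classifier that cannot separate the two devices (error near 0.5) yields an estimate near 0, while near-perfect separation yields an estimate near 2. Only unlabeled features are needed, which matches the caption's point that divergences can be measured on devices without labels.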

Theorems & Definitions (16)

  • Definition 1
  • Theorem 1
  • Theorem 2
  • Definition 2
  • Lemma 1
  • Corollary 1
  • Definition 3
  • Lemma 2: Arithmetic-geometric mean inequality \cite{duffin1972reversed}
  • Theorem 2
  • proof
  • ...and 6 more
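
Lemma 2 names the arithmetic-geometric mean inequality, which is the standard tool for approximating a posynomial by a monomial in successive convex approximation schemes for signomial programs. For a posynomial $g(x) = \sum_k u_k(x)$ with positive terms, the inequality gives $g(x) \ge \prod_k \left(u_k(x)/\alpha_k\right)^{\alpha_k}$ with $\alpha_k = u_k(x_0)/g(x_0)$, and equality holds at the expansion point $x_0$. The sketch below checks this numerically. How the paper applies the lemma is not shown in this excerpt, so the function names and the example posynomial are hypothetical.

```python
import numpy as np


def posynomial_terms(x, coeffs, exponents):
    """Evaluate each monomial term c_k * prod_j x_j ** a_kj.

    x has shape (n,), coeffs has shape (K,), exponents has shape (K, n).
    All entries of x and coeffs are assumed to be strictly positive.
    """
    return coeffs * np.prod(x ** exponents, axis=1)


def condensed_monomial(x, x0, coeffs, exponents):
    """Monomial lower bound on the posynomial, tight at x0 (AM-GM)."""
    u0 = posynomial_terms(x0, coeffs, exponents)
    weights = u0 / u0.sum()  # alpha_k = u_k(x0) / g(x0)
    u = posynomial_terms(x, coeffs, exponents)
    return float(np.prod((u / weights) ** weights))


# Hypothetical posynomial: g(x) = 2*x1*x2 + 0.5*x1^-1 + 3*x2^2
coeffs = np.array([2.0, 0.5, 3.0])
exponents = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, 2.0]])
x0 = np.array([1.0, 1.5])

g_at_x0 = float(posynomial_terms(x0, coeffs, exponents).sum())
approx_at_x0 = condensed_monomial(x0, x0, coeffs, exponents)
```

Because the bound is a monomial, replacing a posynomial denominator with it turns a ratio of posynomials into a posynomial, which becomes convex after a logarithmic change of variables. Repeating the approximation at each new iterate is what makes the scheme "successive".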