Federated Compositional Deep AUC Maximization

Xinwen Zhang, Yihan Zhang, Tianbao Yang, Richard Souvenir, Hongchang Gao

TL;DR

This work tackles imbalanced data in federated learning by formulating a federated compositional deep AUC maximization problem. It proposes LocalSCGDAM, a local stochastic compositional gradient descent ascent algorithm with momentum and periodic synchronization of a moving-average inner-level estimator to handle two-level distributed functions. Theoretical results establish a convergence rate with linear speedup in the number of devices and quantified communication complexity, while empirical results on six imbalanced image datasets demonstrate superior AUC performance over strong baselines. The approach offers a principled, scalable solution for private, imbalanced, large-scale learning in cross-silo federated settings.
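The update pattern described above (a moving-average estimator of the inner function, momentum on both gradient estimates, local descent/ascent steps, and averaging every $p$ iterations) can be illustrated on a toy problem. The following is a minimal sketch, not the authors' algorithm or code: the function name `local_scgdam_sketch`, the linear inner functions $g_k(x) = A_k x - b_k$, the outer function $f(u, y) = y^\top u - \tfrac{1}{2}\|y\|^2$, the update order, and all hyperparameter values are assumptions chosen so that the example is small and converges, not values taken from the paper's theorem.

```python
import numpy as np


def local_scgdam_sketch(A, b, eta=0.05, alpha=10.0, beta_x=10.0, beta_y=10.0,
                        gamma_x=1.0, gamma_y=1.0, p=4, iters=3000,
                        noise=0.1, seed=0):
    """Toy local compositional gradient descent ascent with momentum.

    Hypothetical setup: device k observes g_k(x) = A[k] @ x - b[k] with
    Gaussian noise, and the outer function is f(u, y) = y.u - 0.5*||y||^2,
    which is strongly concave in y.
    """
    rng = np.random.default_rng(seed)
    K, m, d = A.shape
    x = np.zeros((K, d))                      # local primal variables
    y = np.zeros((K, m))                      # local dual variables
    u = np.zeros((K, d))                      # momentum for x
    v = np.zeros((K, m))                      # momentum for y
    h = np.einsum("kmd,kd->km", A, x) - b     # inner-level estimator
    for t in range(1, iters + 1):
        # Noisy sample of the inner function on every device.
        g_sample = (np.einsum("kmd,kd->km", A, x) - b
                    + noise * rng.standard_normal((K, m)))
        # Moving-average estimate of g_k(x_k).
        h = (1 - alpha * eta) * h + alpha * eta * g_sample
        # Chain rule: J_g^T * df/du, where df/du = y for this toy f.
        grad_x = np.einsum("kmd,km->kd", A, y)
        # df/dy evaluated at the estimator h instead of the noisy sample.
        grad_y = h - y
        # Momentum updates for both gradient estimates.
        u = (1 - beta_x * eta) * u + beta_x * eta * grad_x
        v = (1 - beta_y * eta) * v + beta_y * eta * grad_y
        # Local descent on x, ascent on y.
        x = x - gamma_x * eta * u
        y = y + gamma_y * eta * v
        # Periodic synchronization: average all local states, including h.
        if t % p == 0:
            x, y, h, u, v = [np.tile(z.mean(axis=0), (K, 1))
                             for z in (x, y, h, u, v)]
    return x.mean(axis=0), y.mean(axis=0)


# Four devices with slightly different (heterogeneous) inner functions.
rng = np.random.default_rng(1)
K = 4
A = np.eye(2) + 0.1 * rng.standard_normal((K, 2, 2))
b = np.array([1.0, -2.0]) + 0.5 * rng.standard_normal((K, 2))

x_hat, y_hat = local_scgdam_sketch(A, b)
A_bar, b_bar = A.mean(axis=0), b.mean(axis=0)
init_res = np.linalg.norm(b_bar)                   # residual at x = 0
final_res = np.linalg.norm(A_bar @ x_hat - b_bar)  # residual after training
print(init_res, final_res)
```

For this toy objective the saddle point satisfies $\bar{A}x = \bar{b}$ and $y = 0$, so the residual $\|\bar{A}x - \bar{b}\|$ is a convenient progress measure. Averaging the estimator `h` together with the model variables mirrors the "periodic synchronization of a moving-average inner-level estimator" mentioned in the summary.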

Abstract

Federated learning has attracted increasing attention because it promises to balance privacy with large-scale learning, and numerous approaches have been proposed. However, most existing approaches focus on problems with balanced data, and their prediction performance is far from satisfactory in many real-world applications where the number of samples per class is highly imbalanced. To address this problem, we develop a novel federated learning method for imbalanced data that directly optimizes the area under the curve (AUC) score. In particular, we formulate AUC maximization as a federated compositional minimax optimization problem, develop a local stochastic compositional gradient descent ascent algorithm with momentum, and provide bounds on its computational and communication complexities. To the best of our knowledge, this is the first work to achieve such favorable theoretical results. Finally, extensive experimental results confirm the efficacy of our method.
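To see why AUC is a more informative target than accuracy on imbalanced data, consider the small illustration below. It is a generic sketch rather than anything from the paper: the helper `auc_score` is a hypothetical name, and it computes AUC with the standard pairwise (Mann-Whitney) definition, i.e. the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
import numpy as np


def auc_score(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]        # all positive-negative pairs
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


# Imbalanced toy data: 95 negatives, 5 positives.
labels = np.array([0] * 95 + [1] * 5)

# A trivial model that gives every sample the same low score.
constant_scores = np.zeros(100)
accuracy = float(((constant_scores > 0.5).astype(int) == labels).mean())
constant_auc = auc_score(constant_scores, labels)

# A model that ranks every positive above every negative.
perfect_scores = labels.astype(float)
perfect_auc = auc_score(perfect_scores, labels)

print(accuracy, constant_auc, perfect_auc)
```

The trivial model is correct on 95 of 100 samples yet ranks no positive above any negative, so its accuracy looks strong while its AUC stays at chance level.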

Paper Structure (18 sections, 16 theorems, 93 equations, 5 figures, 4 tables, 1 algorithm)

Key Result

Theorem 1

Given Assumptions assumption_smooth through assumption_strong, by setting $\alpha>0$, $\beta_x>0$, $\beta_y>0$, $\eta \leq \min\left\{\frac{1}{2\gamma_x L_{\phi}}, \frac{1}{10p^2\gamma_y L_f}, \frac{1}{\alpha}, \frac{1}{\beta_x}, \frac{1}{\beta_y}, 1\right\}$, $\gamma_y \leq \min\left\{\frac{1}{6L_f}, \frac{3\mu\beta_y^{\cdots}}{\cdots}, \ldots\right\}$, …
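The condition on $\eta$ is the only bound that is fully visible in the statement above, and it is a plain minimum over six terms. As a worked check, the sketch below evaluates it numerically. The function name `eta_upper_bound` and the sample constants are assumptions for illustration; the bound on $\gamma_y$ is cut off in the statement and is therefore not covered.

```python
def eta_upper_bound(gamma_x, gamma_y, L_phi, L_f, p, alpha, beta_x, beta_y):
    """Largest eta allowed by the visible step-size condition of Theorem 1."""
    return min(
        1.0 / (2.0 * gamma_x * L_phi),
        1.0 / (10.0 * p ** 2 * gamma_y * L_f),
        1.0 / alpha,
        1.0 / beta_x,
        1.0 / beta_y,
        1.0,
    )


# Illustrative constants only; they are not values reported in the paper.
bound_p4 = eta_upper_bound(gamma_x=1.0, gamma_y=1.0, L_phi=1.0, L_f=1.0,
                           p=4, alpha=10.0, beta_x=10.0, beta_y=10.0)
bound_p16 = eta_upper_bound(gamma_x=1.0, gamma_y=1.0, L_phi=1.0, L_f=1.0,
                            p=16, alpha=10.0, beta_x=10.0, beta_y=10.0)
print(bound_p4, bound_p16)
```

With these constants the $\frac{1}{10p^2\gamma_y L_f}$ term is the binding one, so the admissible $\eta$ shrinks quadratically as the communication period $p$ grows. The terms $\frac{1}{\alpha}$, $\frac{1}{\beta_x}$, and $\frac{1}{\beta_y}$ keep the products $\alpha\eta$, $\beta_x\eta$, and $\beta_y\eta$ at or below 1.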

Figures (5)

  • Figure 1: Testing performance with AUC score versus the number of iterations when the communication period $p=4$.
  • Figure 2: Testing performance with AUC score versus the number of iterations when the communication period $p=8$.
  • Figure 3: Testing performance with AUC score versus the number of iterations when the communication period $p=16$.
  • Figure 4: The test AUC score versus the number of iterations when using different imbalance ratios for CATvsDOG.
  • Figure 5: The test AUROC score for STL10.

Theorems & Definitions (33)

  • Theorem 1
  • Remark 1
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • ...and 23 more