Review of Mathematical Optimization in Federated Learning

Shusen Yang, Fangyuan Zhao, Zihao Zhou, Liang Shi, Xuebin Ren, Zongben Xu

TL;DR

This survey synthesizes the mathematical optimization landscape of Federated Learning, addressing how non-i.i.d. data, differential privacy, decentralized topologies, and online data streams affect problem formulation, algorithm design, and convergence guarantees. It categorizes and analyzes core optimization methods (first-, second-, and zeroth-order) and surveys advanced strategies to mitigate data heterogeneity, privacy noise, and topology-induced biases, including regularization, interpolation, variance reduction, topology-aware optimization, and robust aggregation. Key contributions include a structured mapping of convergence results under various FL settings, practical mitigation techniques, and a roadmap of future directions emphasizing theory, system constraints, and privacy-enhancing technologies. The work underscores the practical significance of tailoring optimization methods to FL’s unique constraints to achieve reliable, privacy-preserving, and scalable distributed learning in real-world deployments.

Abstract

Federated Learning (FL) has become a popular interdisciplinary research area in both applied mathematics and information sciences. Mathematically, FL aims to collaboratively optimize aggregate objective functions over distributed datasets while satisfying a variety of privacy and system constraints. Unlike conventional distributed optimization methods, FL must address several specific issues (e.g., non-i.i.d. data distributions and differentially private noise), which pose new challenges in problem formulation, algorithm design, and convergence analysis. In this paper, we systematically review existing FL optimization research, including its assumptions, formulations, methods, and theoretical results. Potential future directions are also discussed.

Paper Structure

This paper contains 36 sections, 10 equations, 5 figures, 4 tables, 1 algorithm.

Figures (5)

  • Figure 1: The overview of the survey.
  • Figure 2: General framework and workflow of FL.
  • Figure 3: Illustration of model update trajectories under i.i.d. (a) and non-i.i.d. (b) settings in FL for two clients with $E$ local iterations (Karimireddy et al., 2020).
  • Figure 4: Typical workflow of FL with DP.
  • Figure 5: Illustration of network topologies: fully connected (a), partially connected (b), and time-varying connected (c).