Fast Decentralized Gradient Tracking for Federated Minimax Optimization with Local Updates
Chris Junchi Li
TL;DR
The paper addresses decentralized federated minimax optimization with the objective $f(\boldsymbol{x},\boldsymbol{y})=\frac{1}{n}\sum_{i=1}^{n} f_i(\boldsymbol{x},\boldsymbol{y})$, where $f$ is strongly concave in $\boldsymbol{y}$ and possibly nonconvex in $\boldsymbol{x}$ (the NC-SC setting). It introduces K-GT-Minimax, a gradient-tracking algorithm with local updates, designed to improve communication efficiency and robustness to data heterogeneity. The main contribution is a Lyapunov-based convergence analysis with explicit rates: with the stepsizes for the $\boldsymbol{y}$- and $\boldsymbol{x}$-updates chosen as functions of $p$, $\kappa$, $K$, and $L$, the method reaches an $\varepsilon$-stationary point after $T$ rounds, where $T=O\big(\frac{\sigma^2}{nK}\frac{1}{\varepsilon^4}+\frac{\sigma}{p^2\sqrt{K}}\frac{1}{\varepsilon^3}+\frac{\kappa^3}{p^2}\frac{1}{\varepsilon^2}\big)\times L\mathscr{H}_{0}$. With $K=O\big(\big(1+\frac{\kappa}{\sqrt{p}}\big)\frac{\sigma}{\varepsilon}\big)$, this yields the balanced rate $T=O\big(\frac{\kappa^3}{p^2\varepsilon^2}\big)L\mathscr{H}_{0}$. The results show that integrating gradient tracking with local updates addresses both communication cost and data heterogeneity in decentralized minimax training.
Abstract
Federated learning (FL) for minimax optimization has emerged as a powerful paradigm for training models across distributed nodes/clients while preserving data privacy and remaining robust to data heterogeneity. In this work, we study the decentralized implementation of federated minimax optimization and propose \texttt{K-GT-Minimax}, a novel decentralized minimax optimization algorithm that combines local updates with gradient tracking. Our analysis establishes the algorithm's communication efficiency and convergence rate for nonconvex-strongly-concave (NC-SC) minimax optimization, demonstrating a superior convergence rate compared to existing methods. The ability of \texttt{K-GT-Minimax} to handle data heterogeneity and ensure robustness underscores its significance for federated learning research and applications.
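
To make the idea of combining local updates with gradient tracking concrete, the following is a minimal sketch on a toy problem. It is not the paper's K-GT-Minimax: it assumes a K-GT-style correction update, deterministic gradients ($\sigma=0$), a 4-client ring with uniform mixing weights $1/3$, and hypothetical local objectives $f_i(x,y)=y(a_i x-b_i)-\frac{1}{2}y^2$, which are strongly concave in $y$ (and, for simplicity, yield a convex rather than nonconvex primal function). All function names, stepsizes, and data values are illustrative assumptions.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring: weight 1/3 on self and both neighbors."""
    eye = np.eye(n)
    return (eye + np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)) / 3.0


def local_directions(z, a, b):
    """Per-client update field for f_i(x, y) = y*(a_i*x - b_i) - 0.5*y**2.

    Returns (df_i/dx, -df_i/dy), so that subtracting it descends in x and ascends in y.
    """
    x, y = z[:, 0], z[:, 1]
    return np.stack([a * y, -(a * x - b - y)], axis=1)


def gt_local_gda(a, b, W, eta, K, rounds):
    """Gradient descent-ascent with K local steps and a tracking correction (illustrative).

    eta is a length-2 array (eta_x, eta_y); row i of z holds client i's (x_i, y_i).
    """
    n = len(a)
    z = np.zeros((n, 2))
    c = np.zeros((n, 2))  # tracking corrections; their sum over clients stays zero
    for _ in range(rounds):
        z_start = z.copy()
        for _ in range(K):
            # Local step along the client's own direction plus its correction.
            z = z - eta * (local_directions(z, a, b) + c)
        # Average corrected direction used over the K local steps.
        delta = (z_start - z) / (K * eta)
        # Move each correction toward (neighbor average - own direction).
        c = c - delta + W @ delta
        # One communication round: mix the locally updated iterates.
        z = W @ z
    return z


# Heterogeneous clients: each has its own (a_i, b_i), hence a different local optimum.
a = np.array([0.5, 1.0, 1.5, 1.0])
b = np.array([1.0, -2.0, 3.0, 2.0])
W = ring_mixing_matrix(4)
z_final = gt_local_gda(a, b, W, eta=np.array([0.01, 0.05]), K=5, rounds=1000)
print(z_final)
```

For this toy objective the averaged problem has the saddle point $x^\star=\bar{b}/\bar{a}$, $y^\star=0$, where $\bar{a}$ and $\bar{b}$ are the client averages, so every row of `z_final` should approach that point even though no single client's data has it as its own optimum. The smaller stepsize for $x$ than for $y$ mirrors the usual two-timescale choice in NC-SC analyses; the specific values here are not taken from the paper.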
