Communication-Efficient Gradient Descent-Accent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates
Siqi Zhang, Sayantan Choudhury, Sebastian U Stich, Nicolas Loizou
TL;DR
The paper addresses the communication bottleneck in distributed variational inequality problems (VIPs) arising in federated learning. It develops a unified ProxSkip-VIP framework that covers a class of structured non-monotone VIPs under a general assumption on the stochastic estimators. Distributed VIPs are recast in consensus form with a proximal regularizer, so that randomized proximal skipping and control variates can reduce the number of expensive proximal (communication) steps while preserving convergence to the VIP solution. The authors provide convergence guarantees for ProxSkip-VIP and its specializations (SGDA, GDA, L-SVRGDA), with explicit iteration and communication complexities that improve on traditional local-update approaches without requiring bounded heterogeneity assumptions. For federated minimax problems, the results reduce the number of communication rounds and remain robust under data heterogeneity, as supported by numerical experiments on strongly monotone quadratic games and robust least-squares problems.
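The randomized proximal-skipping idea summarized above can be illustrated with a minimal sketch. It assumes the standard ProxSkip update pattern (a local step corrected by a control variate, followed with probability p by an averaging step) applied to deterministic local operators; it is not taken from the paper's pseudocode. The names `make_problem` and `proxskip_vip_sketch`, the affine operators F_i(z) = A_i z + b_i, and the values of `gamma` and `p` are all illustrative assumptions.

```python
import numpy as np


def make_problem(n=5, d=4, seed=1):
    """Hypothetical heterogeneous problem: F_i(z) = A_i z + b_i per client.

    Each A_i = a_i * I + s_i * J with J skew-symmetric, so every F_i is
    strongly monotone (a_i >= 1) but is not the gradient of a function.
    """
    rng = np.random.default_rng(seed)
    half = d // 2
    J = np.zeros((d, d))
    J[:half, half:] = np.eye(half)    # coupling between the two players
    J[half:, :half] = -np.eye(half)   # skew-symmetric counterpart
    a = rng.uniform(1.0, 2.0, size=n)
    s = rng.uniform(0.0, 1.0, size=n)
    A = a[:, None, None] * np.eye(d) + s[:, None, None] * J
    b = rng.normal(size=(n, d))
    return A, b


def proxskip_vip_sketch(A, b, gamma=0.05, p=0.2, iters=3000, seed=0):
    """ProxSkip-style local updates for the consensus form of a VIP.

    gamma and p are illustrative choices, not the paper's tuned values.
    """
    rng = np.random.default_rng(seed)
    n, d, _ = A.shape
    x = np.zeros((n, d))  # one local iterate per client
    h = np.zeros((n, d))  # control variates; they sum to zero over clients
    for _ in range(iters):
        F = np.einsum("nij,nj->ni", A, x) + b  # local operators F_i(x_i)
        x_hat = x - gamma * (F - h)            # local step, no communication
        if rng.random() < p:
            # Communication round: the prox of the consensus constraint
            # is an average over clients.
            avg = (x_hat - (gamma / p) * h).mean(axis=0)
            x = np.tile(avg, (n, 1))
        else:
            x = x_hat                          # proximal step skipped
        h = h + (p / gamma) * (x - x_hat)      # unchanged when skipped
    return x.mean(axis=0)
```

For this affine example the solution of the averaged problem can be computed directly with `np.linalg.solve(A.mean(axis=0), -b.mean(axis=0))`, which gives a reference point for checking the iterate returned by `proxskip_vip_sketch`. On average only a fraction p of the iterations communicate, which is the source of the communication savings.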
Abstract
Distributed and federated learning algorithms and techniques are associated primarily with minimization problems. However, with the growing use of minimax optimization and variational inequality problems in machine learning, the need for efficient distributed/federated learning approaches for these problems is becoming more apparent. In this paper, we provide a unified convergence analysis of communication-efficient local training methods for distributed variational inequality problems (VIPs). Our approach is based on a general key assumption on the stochastic estimates that allows us to propose and analyze several novel local training algorithms under a single framework for solving a class of structured non-monotone VIPs. We present the first local gradient descent-ascent algorithms with provably improved communication complexity for solving distributed variational inequalities on heterogeneous data. The general algorithmic framework recovers state-of-the-art algorithms and their sharp convergence guarantees when the setting is specialized to minimization or minimax optimization problems. Finally, we demonstrate the strong performance of the proposed algorithms compared to state-of-the-art methods when solving federated minimax optimization problems.
