Fractional Order Distributed Optimization
Andrei Lixandru, Marcel van Gerven, Sergio Pequito
TL;DR
The paper tackles slow and unstable convergence in distributed optimization on directed graphs with ill-conditioned objectives. It introduces FrODO, a framework that integrates fractional-order memory into local gradient updates, yielding updates of the form $x_i^{k+1} = x_i^k - \alpha g_i^k - \beta M_i^k$ where $M_i^k$ aggregates past gradients with a power-law weight. A convergence theorem shows linear convergence $O(\rho^k)$ for appropriate parameters and a memory effect captured by $C(\lambda)$, complemented by complexity analysis and experiments demonstrating substantial speedups in ill-conditioned problems (up to ~4x) and federated neural network training (2–3x) while preserving stability. The work provides practical guidelines for parameter choices (e.g., $\lambda$ around 0.1–0.2 and memory length $T\ge 80$) and suggests broader applicability to distributed control and multi-agent learning where long-term memory can stabilize optimization trajectories.
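To make the update form above concrete, the following is a minimal single-agent sketch, not the authors' implementation. It assumes the power-law weights take the form $w_j = j^{-(1+\lambda)}$ for lags $j = 1, \dots, T$, that $M_i^k$ is the weighted sum of the $T$ most recent past gradients, and it omits the consensus step over the directed graph entirely. All function names and the demo parameters are hypothetical.

```python
from collections import deque

import numpy as np


def power_law_weights(T, lam):
    """Assumed weight form: w_j = j^{-(1 + lam)} for lags j = 1..T."""
    lags = np.arange(1, T + 1, dtype=float)
    return lags ** (-(1.0 + lam))


def memory_term(past_grads, lam, dim):
    """M^k: weighted sum of past gradients, ordered most recent first."""
    if len(past_grads) == 0:
        return np.zeros(dim)
    w = power_law_weights(len(past_grads), lam)
    G = np.stack(list(past_grads))  # shape (n_past, dim)
    return w @ G


def local_step(x, grad, past_grads, alpha, beta, lam):
    """One update x^{k+1} = x^k - alpha * g^k - beta * M^k (no consensus)."""
    return x - alpha * grad - beta * memory_term(past_grads, lam, x.shape[0])


def run_quadratic_demo(steps=200, T=80, lam=0.15, alpha=0.005, beta=0.0005):
    """Apply the update to f(x) = 0.5 * sum(h * x**2) with h = (1, 100)."""
    h = np.array([1.0, 100.0])      # ill-conditioned diagonal curvature
    x = np.array([1.0, 1.0])
    past = deque(maxlen=T)          # index 0 holds the lag-1 gradient
    for _ in range(steps):
        g = h * x                   # exact gradient of the quadratic
        x = local_step(x, g, past, alpha, beta, lam)
        past.appendleft(g)
    return x
```

Calling `run_quadratic_demo()` returns the iterate after 200 steps on a toy quadratic with condition number 100. The step sizes are deliberately conservative so that the iteration stays stable; they are illustrative values, not the tuned parameters reported in the paper.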
Abstract
Distributed optimization is fundamental to modern machine learning applications such as federated learning, but existing methods often struggle with ill-conditioned problems and face stability-versus-speed tradeoffs. We introduce fractional order distributed optimization (FrODO), a theoretically grounded framework that incorporates fractional-order memory terms to enhance convergence properties in challenging optimization landscapes. Our approach achieves provable linear convergence for any strongly connected network. Our empirical results suggest that FrODO converges up to 4 times faster than baselines on ill-conditioned problems and yields a 2 to 3 times speedup in federated neural network training, while maintaining stability and theoretical guarantees.
