Improving the Bit Complexity of Communication for Distributed Convex Optimization
Mehrdad Ghadiri, Yin Tat Lee, Swati Padmanabhan, William Swartworth, David Woodruff, Guanghao Ye
TL;DR
This work provides near-tight, bit-efficient bounds for distributed convex optimization in two communication models: the coordinator model and the blackboard model. Using block leverage scores, non-adaptive adaptive sketches, and a Richardson iteration that keeps only the middle bits, the authors obtain improved upper bounds for problems including $\ell_2$ and $\ell_p$ regression, low-rank approximation, and high-accuracy linear programming, as well as decomposable nonsmooth finite-sum minimization in the blackboard model. They pair these algorithms with new lower bounds, based on a novel $s$-player inner-product game and spherical Radon transform arguments, to establish tightness and a first separation between LP feasibility and linear systems when the number of constraints is polynomial. The results extend to well-conditioned inputs and decomposable problem structures, and they assemble a toolkit (block leverage scores, inverse maintenance, spectral embeddings, and barrier-based IPMs) for distributed optimization at the bit level. Overall, the paper advances the understanding of how to minimize communication costs in distributed convex optimization while preserving accuracy, with implications for large-scale federated and distributed systems.
Abstract
We consider the communication complexity of some fundamental convex optimization problems in the point-to-point (coordinator) and blackboard communication models. We strengthen known bounds for approximately solving linear regression, $p$-norm regression (for $1\leq p\leq 2$), linear programming, minimizing the sum of finitely many convex nonsmooth functions with varying supports, and low rank approximation; for a number of these fundamental problems our bounds are nearly optimal, as proven by our lower bounds. Among our techniques, we use the notion of block leverage scores, which have been relatively unexplored in this context, as well as dropping all but the "middle" bits in Richardson-style algorithms. We also introduce a new communication problem for accurately approximating inner products and establish a lower bound using the spherical Radon transform. Our lower bound can be used to show the first separation of linear programming and linear systems in the distributed model when the number of constraints is polynomial, addressing an open question in prior work.
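To make the "middle bits" idea concrete, the following is a minimal single-machine sketch in Python, not the paper's algorithm. It runs a preconditioned Richardson iteration on the normal equations and rounds each update to a fixed number of bits relative to its largest entry. Because the updates shrink geometrically, the bits above the current scale are zero and the bits below the rounding step are dropped, so only a short window of bits per entry would need to be sent in each round. The names `quantize`, `richardson_rounded`, and `bits`, the choice of preconditioner `M`, and the assumption that the scale of each update is known to all parties are illustrative assumptions and do not come from the abstract.

```python
import numpy as np

def quantize(v, step):
    """Round each entry of v to the nearest multiple of step."""
    return np.round(v / step) * step

def richardson_rounded(A, b, M, iters=30, bits=8):
    """Preconditioned Richardson iteration for A^T A x = A^T b.

    M is assumed to approximate (A^T A)^{-1} well enough that the
    unrounded iteration contracts. Each update is rounded to a grid of
    width scale / 2**bits, where scale is the largest entry of the
    update in absolute value, so every rounded entry fits in roughly
    `bits` bits plus a sign.
    """
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        update = M @ (A.T @ (b - A @ x))
        scale = np.max(np.abs(update))
        if scale == 0:
            break  # residual is exactly zero, nothing left to correct
        step = scale / 2**bits
        x = x + quantize(update, step)
    return x

# Hypothetical usage: a small random least-squares instance with a
# deliberately inexact preconditioner (0.9 times the exact inverse).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
b = rng.standard_normal(50)
M = 0.9 * np.linalg.inv(A.T @ A)
x = richardson_rounded(A, b, M)
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.max(np.abs(x - x_ref)))
```

In this sketch the rounding error introduced in one round is proportional to the size of that round's update, so later rounds can correct it and the iteration still converges to the least-squares solution.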
