Heterogeneous Distributed Zeroth-Order Nonconvex Optimization with Communication Compression
Haonan Wang, Xinlei Yi, Yiguang Hong, Minghui Liwang
TL;DR
The paper addresses distributed zeroth-order optimization in heterogeneous networks by introducing HEDZOC, an algorithm that combines a two-point zeroth-order gradient estimator with communication compression and does not rely on data homogeneity, $\mathcal{O}(pn)$ function evaluations per iteration, or a known P--L constant. It develops a Lyapunov-based analysis that bounds the variance of the gradient estimator via the optimality gap, enabling convergence guarantees under both general nonconvexity and the P--L condition. The main results show linear speedup rates in the number of agents $n$ across three regimes: general nonconvex, P--L with unknown constant, and P--L with known constant, with compression-induced overhead becoming negligible as the compressor approaches lossless transmission. Simulations on adversarial example generation with MNIST validate the theory, demonstrating convergence and significant communication savings even under substantial data heterogeneity. Overall, the work advances practical, scalable distributed zeroth-order optimization by removing classical restrictive assumptions while maintaining fast convergence and communication efficiency.
Abstract
Distributed zeroth-order optimization is increasingly applied in heterogeneous scenarios where agents possess distinct data distributions and objectives. This heterogeneity poses fundamental challenges for convergence analysis, and existing analyses rely on relatively strong assumptions to obtain theoretical guarantees. Specifically, at least one of the following three assumptions is usually required: (i) data homogeneity across agents, (ii) $\mathcal{O}(pn)$ function evaluations per iteration, with $p$ denoting the dimension and $n$ the number of agents, or (iii) the Polyak--Łojasiewicz (P--L) or strong convexity condition with a known corresponding constant. To overcome these limitations, we propose a Heterogeneous Distributed Zeroth-Order Compressed (HEDZOC) algorithm, which is based on a two-point zeroth-order gradient estimator and a general class of compressors. Without assuming data homogeneity, we develop an analysis covering three settings: general nonconvex functions, functions satisfying the P--L condition with an unknown P--L constant, and functions satisfying it with a known constant. To the best of our knowledge, the proposed HEDZOC algorithm is the first distributed zeroth-order method that establishes convergence without relying on any of the above three assumptions. Moreover, it achieves a linear speedup convergence rate, which is comparable to state-of-the-art results attainable under data homogeneity and exact communication assumptions. Finally, experiments on heterogeneous adversarial example generation validate the theoretical results.
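To make the two building blocks named in the abstract concrete, the following is a minimal sketch of a generic two-point zeroth-order gradient estimator and of one example compressor. It is not the HEDZOC algorithm itself. The abstract does not specify the sampling distribution or the compressor, so the choices here (directions drawn uniformly from the unit sphere, a forward-difference form, and top-k sparsification) are assumptions, as are all function names and the smoothing parameter `delta`.

```python
import math
import random


def random_unit_vector(p, rng):
    """Draw a direction uniformly from the unit sphere in R^p (assumed sampling scheme)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def two_point_estimate(f, x, delta, rng):
    """Estimate the gradient of f at x from two function evaluations.

    Uses the common form (p / delta) * (f(x + delta * u) - f(x)) * u,
    where u is a random unit direction and delta > 0 is a smoothing parameter.
    """
    p = len(x)
    u = random_unit_vector(p, rng)
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    scale = p * (f(x_plus) - f(x)) / delta
    return [scale * ui for ui in u]


def top_k(v, k):
    """Example compressor (an assumption): keep the k largest-magnitude entries."""
    order = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)
    keep = set(order[:k])
    return [v[i] if i in keep else 0.0 for i in range(len(v))]


def average_estimate(f, x, delta, rng, num_samples):
    """Average independent two-point estimates to reduce their variance."""
    total = [0.0] * len(x)
    for _ in range(num_samples):
        g = two_point_estimate(f, x, delta, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / num_samples for t in total]


def toy_quadratic(x):
    """Toy objective f(x) = ||x||^2, whose exact gradient is 2x."""
    return sum(c * c for c in x)


rng = random.Random(0)
x0 = [1.0, -2.0, 0.5]
g_avg = average_estimate(toy_quadratic, x0, 1e-4, rng, 20000)
g_sparse = top_k(g_avg, 2)  # what an agent might transmit instead of g_avg
```

A single estimate costs only two function evaluations regardless of the dimension $p$, which is the contrast with schemes that need $\mathcal{O}(pn)$ evaluations per iteration. The price is variance: on the toy quadratic, one estimate is noisy, and only the average over many samples approaches the exact gradient $2x$.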
