Exact Decentralized Optimization via Explicit $\ell_1$ Consensus Penalties
Hong Wang
TL;DR
This work tackles decentralized consensus optimization under memory and communication constraints by reformulating the problem with an explicit $\ell_1$ penalty that enables exact consensus once a computable threshold is exceeded. It introduces a modular two-layer penalty-continuation framework, where an outer loop governs the penalty and an inner plug-and-play saddle-point solver (instantiated as DP$^2$G, a proximal-gradient variant) achieves consensus with fixed steps and minimal per-node memory. The authors prove global convergence via the Kurdyka-Łojasiewicz framework, vanishing disagreement, and linear rates under strong convexity, and validate the approach on distributed ridge, logistic, and elastic-net tasks across networks, showing favorable speed and communication efficiency relative to DGD-type methods and competitiveness with gradient-tracking. This penalty-based exactness approach offers a tracker-free yet accurate decentralized optimization paradigm with practical applicability to composite objectives and constrained settings.
Abstract
Consensus optimization enables autonomous agents to solve joint tasks through peer-to-peer exchanges alone. Classical decentralized gradient descent is appealing for its minimal state but fails to achieve exact consensus with fixed stepsizes unless additional trackers or dual variables are introduced. We revisit penalty methods and introduce a decentralized two-layer framework that couples an outer penalty-continuation loop with an inner plug-and-play saddle-point solver. Any primal-dual routine that satisfies simple stationarity and communication conditions can be used; when instantiated with a proximal-gradient solver, the framework yields the DP$^2$G algorithm, which reaches exact consensus with constant stepsizes, stores only one dual residual per agent, and requires exactly two short message exchanges per inner iteration. An explicit $\ell_1$ penalty enforces agreement and, once above a computable threshold, makes the penalized and constrained problems equivalent. Leveraging the Kurdyka-Łojasiewicz property, we prove global convergence, vanishing disagreement, and linear rates for strongly convex objectives under any admissible inner solver. Experiments on distributed least squares, logistic regression, and elastic-net tasks across various networks demonstrate that DP$^2$G outperforms DGD-type methods in both convergence speed and communication efficiency, is competitive with gradient-tracking approaches while using less memory, and naturally accommodates composite objectives.
