Graphon Particle Systems, Part II: Dynamics of Distributed Stochastic Continuum Optimization

Yan Chen; Tao Li; Xiaofeng Zong

TL;DR

This work studies distributed optimization over a graphon, where a continuum of nodes models the limit of large-scale networks. It introduces continuous-time distributed stochastic gradient descent (D-SGD) and distributed stochastic gradient tracking (D-SGT) algorithms with time-varying gains, proves a general upper-bound lemma for time-varying differential inequalities, and establishes uniform second-moment bounds, $\mathcal{L}^{2}$-consensus, and convergence to the global minimizer $x^{*}$ when the local costs are strongly convex. A decoupling method for coupled differential inequalities yields mean-square convergence of the D-SGT algorithm, including convergence of the auxiliary states to the gradient of the global cost function at the minimizer. The results relate $\mathcal{L}^{2}$-consensus to $\mathcal{L}^{\infty}$-consensus on graphons, and simulations for several network sizes and step sizes illustrate the mean-square convergence. The framework informs the choice of gains and initialization in large-scale heterogeneous networks.

Abstract

We study the distributed optimization problem over a graphon with a continuum of nodes, which is regarded as the limit of distributed networked optimization as the number of nodes goes to infinity. Each node has a private local cost function. The global cost function, which all nodes cooperatively minimize, is the integral of the local cost functions over the node set. We propose stochastic gradient descent and gradient tracking algorithms over the graphon. We establish a general lemma on upper-bound estimates for a class of time-varying differential inequalities with negative linear terms, on the basis of which we prove that, for both kinds of algorithms, the second moments of the nodes' states are uniformly bounded. In particular, for the stochastic gradient tracking algorithm, we reduce the convergence analysis to the asymptotic behavior of coupled nonlinear differential inequalities with time-varying coefficients and develop a decoupling method. For both kinds of algorithms, we show that, with properly chosen time-varying algorithm gains, all nodes' states achieve $\mathcal{L}^{\infty}$-consensus for a connected graphon. Furthermore, if the local cost functions are strongly convex, then all nodes' states converge to the minimizer of the global cost function, and the auxiliary states in the stochastic gradient tracking algorithm converge to the gradient value of the global cost function at the minimizer, uniformly in mean square.
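The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a discretized stochastic gradient descent over a sampled graphon might look like. Everything specific in it is an assumption made for illustration and is not taken from the paper: the graphon $A(p,q)=1-|p-q|$, the quadratic local costs $f_p(x)=\tfrac{1}{2}(x-p)^2$, the gains $(1+t)^{-1/2}$ and $(1+t)^{-1}$, the additive gradient noise, and the Euler-Maruyama discretization. The function name `simulate_dsgd` is hypothetical.

```python
import numpy as np

def simulate_dsgd(n_nodes=100, dt=0.05, n_steps=4000, noise_std=0.1, seed=0):
    """Euler-Maruyama sketch of a consensus-plus-gradient flow on a sampled graphon.

    All modeling choices (graphon, costs, gains, noise) are illustrative assumptions.
    Returns the mean square error to the global minimizer before and after the run.
    """
    rng = np.random.default_rng(seed)
    p = (np.arange(n_nodes) + 0.5) / n_nodes      # node labels sampled from [0, 1]
    A = 1.0 - np.abs(p[:, None] - p[None, :])     # assumed graphon A(p, q) = 1 - |p - q|
    degree = A.sum(axis=1)
    targets = p.copy()                            # assumed local cost f_p(x) = 0.5 * (x - p)^2
    x_star = targets.mean()                       # minimizer of the averaged quadratic cost
    x = rng.normal(size=n_nodes)                  # random initial states
    mse_initial = float(np.mean((x - x_star) ** 2))
    for k in range(n_steps):
        t = k * dt
        a1 = (1.0 + t) ** -0.5                    # assumed consensus gain
        a2 = 1.0 / (1.0 + t)                      # assumed gradient gain
        # Riemann-sum approximation of the integral of A(p, q) * (x_q - x_p) over q
        consensus = (A @ x - degree * x) / n_nodes
        grad = x - targets                        # exact local gradients
        noise = noise_std * np.sqrt(dt) * rng.normal(size=n_nodes)
        x = x + a1 * consensus * dt - a2 * (grad * dt + noise)
    mse_final = float(np.mean((x - x_star) ** 2))
    return mse_initial, mse_final

if __name__ == "__main__":
    before, after = simulate_dsgd()
    print(f"MSE to x*: {before:.4f} -> {after:.4f}")
```

In this toy setting the consensus term preserves the node average because the sampled graphon is symmetric, so the average is driven only by the gradient term and the noise; the decaying gradient gain is what damps the noise over time.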

Paper Structure (8 sections, 12 theorems, 125 equations, 3 figures)

Introduction
Preliminaries
Main Results
Relationship Between $\mathcal{L}^{2}$-Consensus and $\mathcal{L}^{\infty}$-Consensus
Convergence of D-SGD Algorithm
Convergence of D-SGT Algorithm
Simulations
Conclusions and Future Works

Key Result

Lemma 2.1

(Benoit Bonnet) The graphon $W$ is connected in the sense of Definition 2.1 if and only if $\lambda_2(\mathbb{L}_{W})>0$.
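As a rough numerical illustration of this criterion, one can sample a graphon on a uniform grid and inspect the second-smallest eigenvalue of the resulting matrix Laplacian. The sketch below assumes that $\mathbb{L}_{W}$ acts as $(\mathbb{L}_{W}f)(p)=\int_0^1 W(p,q)\,(f(p)-f(q))\,dq$, which is the usual graphon Laplacian but is not stated in the text above. The two example graphons and the helper name `sampled_lambda2` are hypothetical.

```python
import numpy as np

def sampled_lambda2(graphon, n=200):
    """Second-smallest eigenvalue of the Laplacian of a graphon sampled on n grid points."""
    p = (np.arange(n) + 0.5) / n
    W = graphon(p[:, None], p[None, :])          # n x n matrix of sampled edge weights
    L = (np.diag(W.sum(axis=1)) - W) / n         # Riemann-sum discretization of the operator
    eigenvalues = np.linalg.eigvalsh(L)          # ascending order; L is symmetric
    return float(eigenvalues[1])

def connected_graphon(p, q):
    # Assumed example: every pair of nodes away from the corners is linked
    return 1.0 - np.abs(p - q)

def two_block_graphon(p, q):
    # Assumed example: two halves of [0, 1] with no links between them
    return ((p < 0.5) == (q < 0.5)).astype(float)

if __name__ == "__main__":
    print("connected:", sampled_lambda2(connected_graphon))
    print("two blocks:", sampled_lambda2(two_block_graphon))
```

The two-block example has two independent constant modes, so its second eigenvalue vanishes up to rounding, while the connected example keeps a strictly positive gap.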

Figures (3)

Figure 1: Graphon $A$
Figure 2: Left: Mean square errors between states and $x^{*}$, $N=500$, $\Delta t=0.1$; Right: Mean square errors between states and $x^{*}$ for various network sizes, $\Delta t=0.1$.
Figure 3: Mean square errors between states and $x^{*}$ for various step-sizes, $N=500$.

Theorems & Definitions (17)

Definition 2.1
Lemma 2.1
Lemma 3.1
Lemma 3.2
Theorem 3.1
Remark 3.1
Lemma 3.3
Lemma 3.4
Theorem 3.2
Remark 3.2
...and 7 more
