Distributed alternating gradient descent for convex semi-infinite programs over a network

Ashwin Aravind; Debasish Chatterjee; Ashish Cherukuri

TL;DR

This work addresses solving convex semi-infinite programs (SIPs) in a distributed setting over time-varying networks, where each node holds a local objective and the semi-infinite constraint is shared. It introduces a first-order algorithm that combines consensus, outer gradient steps on the objective, and inner gradient steps on the constraint (via a CoMirror-inspired scheme) to enforce feasibility without projecting onto an infinite constraint set. Theoretical guarantees show that the nodes reach consensus and both the feasibility violation and the suboptimality decay at rate $O(1/\sqrt{K})$, with a finite bound on the inner-iteration count per outer iteration. Numerical experiments, including comparisons with cutting-surface ADMM and scenario-based methods and a robust meta-control example, validate the approach and demonstrate favorable computational efficiency and asymptotic optimality in distributed SIPs.
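For orientation, the problem class described above can be written as follows. The symbols ($n$ nodes, local objectives $f_i$, constraint function $g$, index set $\Omega$, decision dimension $d$, and a generic estimate $\tilde{x}^K$ after $K$ outer iterations) are notation assumed for this summary and need not match the paper's:

$$
\min_{x \in \mathbb{R}^d} \; \sum_{i=1}^{n} f_i(x)
\quad \text{subject to} \quad g(x, \omega) \le 0 \;\; \text{for all } \omega \in \Omega,
$$

where each convex $f_i$ is known only to node $i$, $g(\cdot, \omega)$ is convex for every $\omega$, and $\Omega$ contains infinitely many indices. In this notation, the stated guarantees take the form

$$
\max_{\omega \in \Omega} g(\tilde{x}^K, \omega) \le O\!\left(\tfrac{1}{\sqrt{K}}\right),
\qquad
\sum_{i=1}^{n} f_i(\tilde{x}^K) - \sum_{i=1}^{n} f_i(x^\star) \le O\!\left(\tfrac{1}{\sqrt{K}}\right),
$$

with $x^\star$ denoting an optimizer. Which iterate (last or averaged) plays the role of $\tilde{x}^K$ is not specified in this summary.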

Abstract

This paper presents a first-order distributed algorithm for solving a convex semi-infinite program (SIP) over a time-varying network. In this setting, the objective function of the optimization problem is a sum of functions, each held by one node in the network. The semi-infinite constraint, on the other hand, is known to all nodes. The nodes collectively aim to solve the problem using local data about the objective and limited communication capabilities that depend on the network topology. Our algorithm is built on three key ingredients: a consensus step, gradient descent on the local objective, and local gradient descent iterations on the constraint at a node whenever its estimate violates the semi-infinite constraint. The algorithm is constructed, and its parameters are prescribed, so that the iterates held by each node provably converge to an optimizer. That is, as the algorithm progresses, the estimates achieve consensus, and the constraint violation and the error in the optimal value are bounded above by vanishing terms. Simulation examples illustrate our results.
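The three ingredients named in the abstract can be sketched on a toy problem. The following is a minimal illustration under stated assumptions, not the paper's algorithm: the data `C`, the mixing matrices `W_CYCLE` and `W_LINE`, the step-size schedule `1/sqrt(k+1)`, the tolerance `0.1 * alpha`, the damping factor `beta`, and the closed-form worst-case index in `worst_case` are all hypothetical choices made for this example.

```python
import math

# Hypothetical toy data: node i holds f_i(x) = 0.5 * ||x - C[i]||^2 on R^2.
C = [(3.0, 1.0), (1.0, -1.0), (3.0, -1.0), (1.0, 1.0)]

# Doubly stochastic mixing matrices for a 4-node cycle and a 4-node line (assumed).
T = 1.0 / 3.0
W_CYCLE = [[T, T, 0, T], [T, T, T, 0], [0, T, T, T], [T, 0, T, T]]
W_LINE = [[2 * T, T, 0, 0], [T, T, T, 0], [0, T, T, T], [0, 0, T, 2 * T]]


def worst_case(x):
    """Shared semi-infinite constraint g(x, w) = cos(w)*x[0] + sin(w)*x[1] - 1 <= 0
    for all w in [0, 2*pi], i.e. the unit disk.
    Returns (max over w of g(x, w), gradient of g in x at the maximizing w)."""
    w = math.atan2(x[1], x[0])  # closed-form maximizer, specific to this toy constraint
    a = (math.cos(w), math.sin(w))
    return a[0] * x[0] + a[1] * x[1] - 1.0, a


def run(K=3000, beta=0.5, max_inner=50):
    n = len(C)
    x = [(0.0, 0.0)] * n
    for k in range(K):
        alpha = 1.0 / math.sqrt(k + 1)         # diminishing outer step (assumed schedule)
        tol = 0.1 * alpha                      # vanishing feasibility tolerance (assumed)
        W = W_CYCLE if k % 2 == 0 else W_LINE  # alternate graphs: a time-varying network
        new_x = []
        for i in range(n):
            # 1) Consensus: weighted average of the neighbours' estimates.
            v = [sum(W[i][j] * x[j][d] for j in range(n)) for d in range(2)]
            # 2) Outer gradient step on the local objective, grad f_i(v) = v - C[i].
            y = [v[d] - alpha * (v[d] - C[i][d]) for d in range(2)]
            # 3) Inner gradient steps on the constraint while it is violated.
            for _ in range(max_inner):
                viol, a = worst_case(y)
                if viol <= tol:
                    break
                # Damped step along -grad g; the gradient has unit norm in this toy.
                y = [y[d] - beta * viol * a[d] for d in range(2)]
            new_x.append(tuple(y))
        x = new_x
    return x
```

For this toy instance the constrained minimizer is (1, 0): the constraint describes the unit disk, and the unconstrained minimizer of the summed objective is (2, 0). Calling `run()` should therefore return four estimates clustered near that point. Note that no projection onto the feasible set is computed; feasibility is approached only through gradient steps on the violated constraint.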

Paper Structure (13 sections, 7 theorems, 38 equations, 4 figures, 1 algorithm)

Introduction
Related works
Contributions
Organization
Notation
Distributed alternating gradient descent
The algorithm
Convergence analysis of Algorithm \ref{algo:dis_com}
Numerical experiments
Example for the distributed semi-infinite optimization setup
Comparison results
Robust meta control design
Conclusion

Key Result

Lemma 3

(Strict feasibility implies gradient lower bound for constraint function): Let Assumption ad1 hold. Then there exists $G_0 > 0$ such that $\min_{x \in \mathbb{L}_{0}(f)} \lVert \nabla_{x} f(x \ldots$

Figures (4)

Figure 1: Plots illustrating the application of Algorithm \ref{algo:dis_com} to problem \ref{eq:sip_numerical}. Panels (a), (b), and (c) correspond to the cycle configuration of the network, and panels (d), (e), and (f) to the line configuration. The evolution of the objective value at the estimates held by each node is displayed in (a) and (d). Similarly, the evolution of the error between the estimates and the optimizer is given in (b) and (e). Finally, the constraint violations at the estimates are shown in (c) and (f). These plots indicate that the algorithm reaches an optimizer asymptotically.
Figure 2: Comparison of the convergence behaviour of Algorithm \ref{algo:dis_com}, the Distributed Cutting-Surface ADMM (DCSA) of [AC-AZ-GB-ARH:22], and the Distributed Scenario-based Algorithm (DSA) of [KM-AF-SG-MP:18]. Performance is evaluated based on both feasibility and optimality of the averaged estimate $\widehat{x}^k$ (i.e., the mean of the local estimates across all nodes). Algorithm \ref{algo:dis_com} and DCSA exhibit similar, strong performance, with trajectories that closely approach the optimal value and satisfy the feasibility tolerance. In contrast, DSA with 50 and 500 scenarios shows significant suboptimality and infeasibility, highlighting the limitations of scenario-based methods when too few samples are used. The DSA variant with 5000 scenarios improves in both respects, yet still underperforms slightly compared to Algorithm \ref{algo:dis_com} and DCSA.
Figure 3: Comparison of the computational behaviour of Algorithm \ref{algo:dis_com}, the Distributed Cutting-Surface ADMM (DCSA) of [AC-AZ-GB-ARH:22], and the Distributed Scenario-based Algorithm (DSA) of [KM-AF-SG-MP:18]. We plot the average time per iteration for each method; see Section \ref{sec:comparison} for details. Algorithm \ref{algo:dis_com} completes the full run (5000 iterations) in approximately 37 seconds per node on average, while DCSA takes around 683 seconds per node and DSA with 5000 scenarios takes about 280 seconds per node. This highlights the significant computational advantage of Algorithm \ref{algo:dis_com} in achieving comparable performance. Although DSA with 500 and 50 scenarios exhibits computation times similar to those of Algorithm \ref{algo:dis_com}, the corresponding solution quality is noticeably lower (see Fig. \ref{fig:comparison}).
Figure 4: Fig. \ref{fig:discon:cntrl} plots the robust meta control sequence $(\mathbbm{u}^t)_{t=0}^{99}$ obtained by solving \ref{eq:sim:discon:problem} using Algorithm \ref{algo:dis_com}. Fig. \ref{fig:discon:traj} shows 1000 state trajectories generated by applying this control sequence to the parameter-dependent system \ref{eq:sim:discon:dynamics}, with each trajectory corresponding to a random realization of the uncertain parameter. Notably, the control sequence satisfies the terminal state constraint \ref{eq:sim:discon:con} for all realizations, highlighting its robustness.
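
As a quick reading aid for the timings quoted in the Figure 3 caption, the implied ratios can be worked out directly. This is only arithmetic on the rounded figures given there, and it assumes the quoted totals refer to comparable full runs:

$$
\frac{683\ \text{s}}{37\ \text{s}} \approx 18.5,
\qquad
\frac{280\ \text{s}}{37\ \text{s}} \approx 7.6,
\qquad
\frac{37\ \text{s}}{5000\ \text{iterations}} = 7.4\ \text{ms per iteration per node}.
$$

Under that assumption, DCSA takes roughly 18 times as long per node as Algorithm \ref{algo:dis_com}, and DSA with 5000 scenarios roughly 8 times as long.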

Theorems & Definitions (16)

Remark 1
Remark 2
Lemma 3
proof
proof
Lemma 4
proof
Proposition 5
proof
Lemma 6
...and 6 more
