On Linear Convergence of Distributed Stochastic Bilevel Optimization over Undirected Networks via Gradient Aggregation

Ajay Tak, Mayank Baranwal

TL;DR

This work tackles bilevel distributed optimization over undirected networks by introducing the BDASG algorithm, which couples gradient aggregation with consensus to solve distributed bilevel problems without a central coordinator. The main theoretical contribution is a proof of linear convergence in expectation to a neighborhood of the global optimum when the aggregate objective $f(x)=\sum_{i=1}^n f_i(x)$ is strongly convex, which relaxes the usual requirement that each local function be convex; the authors also discuss the plausibility of linear convergence under the Polyak-Łojasiewicz (PL) condition. The analysis decouples consensus and optimization by proving a gradient-reduction property for $Y$ and a PL-based contraction for the averaged iterate $\bar{x}$, with the contraction rate controlled by the network spectral gap $\sigma_2$ and the step size. Numerical experiments on distributed sensor networks and rank-deficient distributed linear regression validate the method and show robust, scalable performance on undirected networks. The work thus broadens the applicability of distributed bilevel optimization under minimal structural assumptions and provides a practical algorithm for large-scale networked learning and sensing.

Abstract

Many large-scale constrained optimization problems can be formulated as bilevel distributed optimization tasks over undirected networks, where agents collaborate to minimize a global cost function while adhering to constraints, relying only on local communication and computation. In this work, we propose a distributed stochastic gradient aggregation scheme and establish its linear convergence under the weak assumption of global strong convexity, which relaxes the common requirement that the local objective and constraint functions be convex. Specifically, we prove that the algorithm converges at a linear rate when the global objective function (and not each local objective function) is strongly convex. Our results significantly extend existing theoretical guarantees for distributed bilevel optimization. Additionally, we demonstrate the effectiveness of our approach through numerical experiments on distributed sensor network problems and distributed linear regression with rank-deficient data.
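The distinction between global and local strong convexity can be illustrated with a small rank-deficient least-squares instance. The sketch below is not taken from the paper's experiments: the two-agent data `A1`, `A2` and the helper `local_hessian` are hypothetical, chosen only so that each local Hessian is singular while their sum is positive definite.

```python
import numpy as np

# Hypothetical data (not from the paper): two agents, each holding a
# single regression row in R^2, so each local design matrix has rank 1.
A1 = np.array([[1.0, 0.0]])
A2 = np.array([[0.0, 1.0]])

def local_hessian(A):
    """Hessian of the local least-squares loss f_i(x) = 0.5 * ||A_i x - b_i||^2."""
    return A.T @ A

H1, H2 = local_hessian(A1), local_hessian(A2)
H_global = H1 + H2  # Hessian of the aggregate objective f = f_1 + f_2

# Smallest eigenvalue of each Hessian: zero means "not strongly convex".
min_eig_local = [float(np.linalg.eigvalsh(H).min()) for H in (H1, H2)]
min_eig_global = float(np.linalg.eigvalsh(H_global).min())

print("local minimum eigenvalues:", min_eig_local)
print("global minimum eigenvalue:", min_eig_global)
```

In this toy instance neither agent's loss is strongly convex on its own, yet the aggregate is, which is the regime the abstract's convergence claim is stated for.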

Paper Structure

This paper contains 5 sections, 11 theorems, 55 equations, 1 figure, 1 algorithm.

Key Result

Lemma 1

For any ${\mathbf{x}}\in{\mathbb R}^d$, the following holds: $\|{\mathbf{x}}\|_2\leq\|{\mathbf{x}}\|_1\leq\sqrt{d}\|{\mathbf{x}}\|_2$.
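As a quick sanity check, the inequality in Lemma 1 can be tested numerically on random vectors. This is only an illustrative sketch: the helper name `check_norm_equivalence` and the floating-point tolerance `tol` are assumptions introduced here, and a numerical check is of course not a proof.

```python
import numpy as np

def check_norm_equivalence(x, tol=1e-12):
    """Return True if ||x||_2 <= ||x||_1 <= sqrt(d) * ||x||_2 holds up to tol."""
    x = np.asarray(x, dtype=float)
    d = x.size
    l1 = np.linalg.norm(x, 1)  # sum of absolute values
    l2 = np.linalg.norm(x, 2)  # Euclidean norm
    return bool(l2 <= l1 + tol and l1 <= np.sqrt(d) * l2 + tol)

# Random Gaussian vectors in a few dimensions (seed fixed for repeatability).
rng = np.random.default_rng(0)
all_hold = all(
    check_norm_equivalence(rng.standard_normal(d))
    for d in (1, 5, 50)
    for _ in range(100)
)
print("inequality held on all samples:", all_hold)
```

The left inequality is tight for a standard basis vector and the right one for a vector with equal-magnitude entries, so `check_norm_equivalence([1, 0, 0])` and `check_norm_equivalence(np.ones(4))` exercise the two boundary cases.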

Figures (1)

  • Figure 1: Distributed sensor network problem on a 150-node network: (a) network topology, (b) convergence behavior. Distributed linear regression with rank-deficient data on (c) a star graph and (d) a ring graph.

Theorems & Definitions (26)

  • Definition 1: PL-inequality
  • Definition 2: $L$-smooth
  • Lemma 1: Norm equivalence
  • Lemma 2: Co-coercivity [Vu2017ConvexOptimization]
  • Remark 1
  • Remark 2
  • Remark 3
  • Lemma 3
  • proof
  • Remark 4
  • ...and 16 more