A Single-Loop Algorithm for Decentralized Bilevel Optimization

Youran Dong; Shiqian Ma; Junfeng Yang; Chao Yin

TL;DR

This work tackles decentralized bilevel optimization with a strongly convex lower level by introducing SLDBO, a fully single-loop algorithm that uses only two matrix-vector multiplications per iteration. It combines gradient tracking with a projection step to remove the need for gradient-heterogeneity assumptions, and provides an $O(1/K)$ convergence rate for stationarity along with consensus guarantees. The authors validate the approach on synthetic and MNIST-based hyperparameter problems, showing faster convergence and reduced communication overhead compared to baselines. The method enhances scalability of distributed bilevel optimization and broadens its applicability in privacy-preserving, networked settings.

Abstract

Bilevel optimization has gained significant attention in recent years due to its broad applications in machine learning. This paper focuses on bilevel optimization in decentralized networks and proposes a novel single-loop algorithm for solving decentralized bilevel optimization with a strongly convex lower-level problem. Our approach is a fully single-loop method that approximates the hypergradient using only two matrix-vector multiplications per iteration. Importantly, our algorithm does not require any gradient heterogeneity assumption, distinguishing it from existing methods for decentralized bilevel optimization and federated bilevel optimization. Our analysis demonstrates that the proposed algorithm achieves the best-known convergence rate for bilevel optimization algorithms. We also present experimental results on hyperparameter optimization problems using both synthetic and MNIST datasets, which demonstrate the efficiency of our proposed algorithm.
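To make the phrase "fully single-loop with two matrix-vector multiplications per iteration" concrete, the following is a minimal single-node sketch of a generic single-loop hypergradient scheme. It is not the authors' SLDBO: there is no network, no gradient tracking, and the quadratic toy problem, the function name `single_loop_bilevel`, the step sizes, and the projection radius are all assumptions made for illustration. The lower level is $g(x,y)=\tfrac12 y^\top A y - y^\top B x$ (strongly convex in $y$) and the upper level is $f(x,y)=\tfrac12\|y-c\|^2+\tfrac{\lambda}{2}\|x\|^2$.

```python
import numpy as np

def single_loop_bilevel(A, B, c, lam, alpha=0.02, beta=0.4, gamma=0.4,
                        radius=10.0, iters=3000):
    """Generic single-loop bilevel sketch on a quadratic toy problem.

    x: upper-level variable, y: lower-level variable,
    v: running estimate of the solution of (Hessian_yy g) v = grad_y f.
    All parameter values are illustrative assumptions.
    """
    dy, dx = B.shape
    x, y, v = np.zeros(dx), np.zeros(dy), np.zeros(dy)
    for _ in range(iters):
        grad_y_g = A @ y - B @ x      # lower-level gradient
        grad_y_f = y - c              # upper-level gradient w.r.t. y
        hvp = A @ v                   # matrix-vector product 1: Hessian_yy g times v
        jvp = -(B.T @ v)              # matrix-vector product 2: Hessian_xy g times v
        hypergrad = lam * x - jvp     # grad_x f minus (Hessian_xy g) v

        # One step on each variable per iteration; no inner loop.
        x_next = x - alpha * hypergrad
        y = y - beta * grad_y_g
        v = v - gamma * (hvp - grad_y_f)

        # Projection of v onto a Euclidean ball (hypothetical radius).
        v_norm = np.linalg.norm(v)
        if v_norm > radius:
            v *= radius / v_norm
        x = x_next
    return x, y, v

# Toy data (assumed): A is symmetric positive definite, B has spectral norm 1.
rng = np.random.default_rng(0)
A = np.diag([1.0, 1.5, 2.0, 2.5])
B = rng.standard_normal((4, 3))
B /= np.linalg.norm(B, 2)
c = np.ones(4)
lam = 1.0

x, y, v = single_loop_bilevel(A, B, c, lam)

# Closed-form minimizer of the reduced objective, for comparison only.
W = np.linalg.solve(A, B)             # y*(x) = W x
x_star = np.linalg.solve(W.T @ W + lam * np.eye(3), W.T @ c)
print(np.linalg.norm(x - x_star))
```

For a quadratic lower level the gradient of $g$ also happens to be a matrix-vector product; in general only the two second-order terms (`hvp` and `jvp`) require Hessian or Jacobian access, which is what the "two matrix-vector multiplications" count refers to in this sketch.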

Paper Structure (15 sections, 12 theorems, 86 equations, 4 figures, 1 algorithm)

Introduction
Main contributions.
Notation.
A Single-Loop Algorithm for Decentralized Bilevel Optimization
Assumptions
The Proposed SLDBO Algorithm
Convergence Rate Results for SLDBO
Numerical Experiments
Synthetic Data
Real-World Data
Concluding Remarks
Proof of the Convergence Results
Notation, Constants and Basic Lemmas
Consensus Error of Algorithm 1
Convergence Rate of Algorithm 1

Key Result

Theorem 3.1

For any integer $K\geq 1$ and each $0\leq k\leq K$, define $\bar{x}^k = \frac{1}{n}\sum_{i=1}^{n}x^k_i$, $\bar{y}^k = \frac{1}{n}\sum_{i=1}^{n}y^k_i$ and $\bar{v}^k = \frac{1}{n}\sum_{i=1}^{n}v^k_i$. The following convergence rate results hold for Algorithm 1 (SLDBO).
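The network averages in the theorem can be illustrated with a small numerical sketch. It stacks the local iterates $x^k_i$ as rows of a matrix, computes $\bar{x}^k$, and measures a consensus error $\sum_i \|x^k_i - \bar{x}^k\|^2$. The ring topology, the mixing matrix `W_mix`, and the exact form of the consensus error are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

n, p = 5, 3                              # number of nodes, variable dimension (assumed)
rng = np.random.default_rng(1)
X = rng.standard_normal((n, p))          # row i plays the role of x_i^k

x_bar = X.mean(axis=0)                   # network average, as defined in the theorem
consensus_error = np.sum((X - x_bar) ** 2)

# Hypothetical doubly stochastic mixing matrix for a ring:
# each node averages itself and its two neighbours with weight 1/3.
W_mix = np.zeros((n, n))
for i in range(n):
    W_mix[i, i] = 1.0 / 3.0
    W_mix[i, (i - 1) % n] = 1.0 / 3.0
    W_mix[i, (i + 1) % n] = 1.0 / 3.0

X_mixed = W_mix @ X                      # one communication (gossip) round
x_bar_mixed = X_mixed.mean(axis=0)
consensus_error_mixed = np.sum((X_mixed - x_bar_mixed) ** 2)

print(consensus_error, consensus_error_mixed)
```

Because the assumed mixing matrix is doubly stochastic, one communication round leaves the average unchanged while shrinking the spread of the local copies around it.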

Figures (4)

Figure 1: Comparison between MA-DSBO and SLDBO on synthetic data ($p=50$).
Figure 2: Comparison between MA-DSBO and SLDBO on synthetic data ($p=200$).
Figure 3: Comparison between MA-DSBO, SLDBO (w/o proj.) and SLDBO on synthetic data. Dimension: $p=50$. Heterogeneity rate: $r=1$ (left), $r=40$ (right).
Figure 4: Comparison of test loss, train loss, and classification accuracy between MA-DSBO and SLDBO on real-world MNIST dataset.

Theorems & Definitions (27)

Remark 2.1
Theorem 3.1
Lemma A.1
proof
Lemma A.2
Lemma A.3
proof
Remark A.1
Lemma A.4
proof
...and 17 more
