A Framework for Bilevel Optimization on Riemannian Manifolds

Andi Han, Bamdev Mishra, Pratik Jawanpuria, Akiko Takeda

TL;DR

This study introduces a framework for solving bilevel optimization problems, where the variables in both the lower and upper levels are constrained on Riemannian manifolds, and extends it to encompass stochastic bilevel optimization and incorporate the use of general retraction.

Abstract

Bilevel optimization has gained prominence in various applications. In this study, we introduce a framework for solving bilevel optimization problems, where the variables in both the lower and upper levels are constrained on Riemannian manifolds. We present several hypergradient estimation strategies on manifolds and analyze their estimation errors. Furthermore, we provide comprehensive convergence and complexity analyses for the proposed hypergradient descent algorithm on manifolds. We also extend our framework to encompass stochastic bilevel optimization and incorporate the use of general retraction. The efficacy of the proposed framework is demonstrated through several applications.
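
To make the idea of hypergradient descent with a retraction concrete, the following is a minimal sketch under strong simplifying assumptions, not the authors' algorithm. It assumes the upper-level variable $x$ lives on the unit sphere, the lower-level variable $y$ lives in flat Euclidean space, the lower-level objective is the quadratic $g(x, y) = \tfrac{1}{2} y^\top A y - y^\top B x$ with $A$ symmetric positive definite, and the upper-level objective is $f(x, y) = \tfrac{1}{2}\|y - c\|^2$. All names (`make_problem`, `hypergrad_hinv`, `hypergrad_neumann`, and so on) are hypothetical. The sketch compares an exact Hessian-inverse hypergradient with a truncated Neumann-series approximation, then takes Riemannian steps by projecting onto the tangent space and retracting by normalization.

```python
import numpy as np


def make_problem(n=5, seed=0):
    """Build a toy instance (hypothetical): SPD matrix A, coupling B, target c."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)  # symmetric positive definite
    B = rng.standard_normal((n, n))
    c = rng.standard_normal(n)
    return A, B, c


def lower_solution(A, B, x):
    """y*(x) = argmin_y 0.5 y^T A y - y^T B x, i.e. the solution of A y = B x."""
    return np.linalg.solve(A, B @ x)


def upper_objective(A, B, c, x):
    """F(x) = f(x, y*(x)) with f(x, y) = 0.5 ||y - c||^2."""
    y = lower_solution(A, B, x)
    return 0.5 * np.sum((y - c) ** 2)


def hypergrad_hinv(A, B, c, x):
    """Euclidean hypergradient via an exact Hessian solve.

    Implicit differentiation gives grad_x f - (d2_xy g) (d2_yy g)^{-1} grad_y f.
    Here grad_x f = 0, d2_xy g = -B^T, d2_yy g = A, grad_y f = y* - c.
    """
    y = lower_solution(A, B, x)
    return B.T @ np.linalg.solve(A, y - c)


def hypergrad_neumann(A, B, c, x, gamma, T):
    """Same hypergradient, with A^{-1} v replaced by the truncated Neumann
    series gamma * sum_{t=0}^{T-1} (I - gamma A)^t v."""
    v = lower_solution(A, B, x) - c
    term = v.copy()
    acc = np.zeros_like(v)
    for _ in range(T):
        acc += term
        term = term - gamma * (A @ term)  # multiply by (I - gamma A)
    return B.T @ (gamma * acc)


def sphere_project(x, v):
    """Orthogonal projection of v onto the tangent space of the sphere at x."""
    return v - np.dot(x, v) * x


def sphere_retract(x, v):
    """Retraction on the unit sphere: step in the ambient space, then normalize."""
    z = x + v
    return z / np.linalg.norm(z)


def riemannian_hypergradient_descent(A, B, c, x0, step=0.1, iters=200):
    """Hypergradient descent on the sphere using the exact Hessian solve."""
    x = x0 / np.linalg.norm(x0)
    history = [upper_objective(A, B, c, x)]
    for _ in range(iters):
        rgrad = sphere_project(x, hypergrad_hinv(A, B, c, x))
        x = sphere_retract(x, -step * rgrad)
        history.append(upper_objective(A, B, c, x))
    return x, history


if __name__ == "__main__":
    A, B, c = make_problem()
    x0 = np.ones(A.shape[0])
    x, history = riemannian_hypergradient_descent(A, B, c, x0)
    print("initial F:", history[0], "final F:", history[-1])
```

In this sketch the Riemannian hypergradient is simply the tangent-space projection of the Euclidean one, which is valid for the sphere as an embedded submanifold with the induced metric. The Neumann variant trades the linear solve for `T` matrix-vector products and converges when `gamma` is below $2/\lambda_{\max}(A)$.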

Paper Structure

This paper contains 40 sections, 19 theorems, 95 equations, 3 figures, 4 tables, 4 algorithms.

Key Result

Proposition 1

The differential of $y^*(x)$ and the Riemannian hypergradient of $F(x)$ are given by

Figures (3)

  • Figure 1: Panels (a) and (b) plot the upper-level objective (Upper Objective) for the different strategies. The HINV and CG strategies converge fastest, followed by NS and AD. The corresponding estimation errors are shown in (c). Panel (d) shows the robustness of the approximation error obtained by NS across different $\gamma$ and $T$ values.
  • Figure 2: Panels (a), (b), and (c) show the performance of RHGD on the hyper-representation problems on SPD networks. Panel (d) shows that the proposed RHGD algorithms generalize better than the projected-gradient PHGD baselines on the MiniImageNet dataset.
  • Figure : Riemannian stochastic bilevel optimization with Hessian inverse.

Theorems & Definitions (35)

  • Proposition 1
  • Definition 1: Lipschitzness
  • Definition 2: $\epsilon$-stationary point
  • Lemma 1: Hypergradient approximation error bound
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Proposition 2 [boumal2023introduction]
  • Lemma 2 [sun2019escaping, han2023riemna]
  • Lemma 3: Trigonometric distance bound [zhang2016first, zhang2016riemannian, han2021riemannian]
  • ...and 25 more