Riemannian Bilevel Optimization

Jiaxiang Li, Shiqian Ma

TL;DR

The paper proposes two algorithms, RieBO for the deterministic setting and RieSBO for the stochastic setting, that include existing Euclidean bilevel optimization algorithms as special cases. Numerical experiments demonstrate the applicability and efficiency of both methods.

Abstract

In this work, we consider the bilevel optimization problem on Riemannian manifolds. We inspect the calculation of the hypergradient of such problems on general manifolds and thus enable the utilization of gradient-based algorithms to solve such problems. The calculation of the hypergradient requires utilizing the notion of Riemannian cross-derivative and we inspect the properties and the numerical calculations of Riemannian cross-derivatives. Algorithms in both deterministic and stochastic settings, named respectively RieBO and RieSBO, are proposed that include the existing Euclidean bilevel optimization algorithms as special cases. Numerical experiments on robust optimization on Riemannian manifolds are presented to show the applicability and efficiency of the proposed methods.
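
The abstract notes that the existing Euclidean bilevel algorithms are special cases of RieBO and RieSBO. As a rough illustration of what a hypergradient is in that flat special case only, the sketch below uses the standard implicit-function formula $\nabla\Phi(x) = \nabla_x f - \nabla^2_{xy} g\,[\nabla^2_{yy} g]^{-1}\nabla_y f$ on a toy quadratic problem. The problem itself, the function names, and the parameters (`make_problem`, `A`, `H`, `b`, `lam`) are all assumptions made for illustration; this is not the paper's algorithm and involves no manifold structure.

```python
import numpy as np

def make_problem(seed=0, nx=3, ny=4, lam=0.1):
    """Build a toy Euclidean bilevel problem (hypothetical, for illustration).

    Lower level: g(x, y) = 0.5 * y^T H y - y^T A x   (strongly convex in y)
    Upper level: f(x, y) = 0.5 * ||y - b||^2 + 0.5 * lam * ||x||^2
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((ny, nx))
    M = rng.standard_normal((ny, ny))
    H = M @ M.T + ny * np.eye(ny)  # symmetric positive definite
    b = rng.standard_normal(ny)
    return A, H, b, lam

def lower_solution(x, A, H):
    # y*(x) solves grad_y g = H y - A x = 0.
    return np.linalg.solve(H, A @ x)

def upper_value(x, A, H, b, lam):
    # Phi(x) = f(x, y*(x))
    y = lower_solution(x, A, H)
    return 0.5 * np.sum((y - b) ** 2) + 0.5 * lam * np.sum(x ** 2)

def hypergradient(x, A, H, b, lam):
    y = lower_solution(x, A, H)
    grad_x_f = lam * x
    grad_y_f = y - b
    # Cross-derivative of g as a linear map from y-space to x-space.
    # The Jacobian of grad_y g with respect to x is -A; its adjoint is -A^T.
    cross_xy_g = -A.T
    # Solve with the lower-level Hessian instead of forming its inverse.
    v = np.linalg.solve(H, grad_y_f)
    return grad_x_f - cross_xy_g @ v

def finite_difference_gradient(x, A, H, b, lam, eps=1e-6):
    # Central differences on Phi, used only to sanity-check the formula.
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        diff = upper_value(x + e, A, H, b, lam) - upper_value(x - e, A, H, b, lam)
        grad[i] = diff / (2 * eps)
    return grad

if __name__ == "__main__":
    A, H, b, lam = make_problem()
    x = np.ones(3)
    print("hypergradient:     ", hypergradient(x, A, H, b, lam))
    print("finite differences:", finite_difference_gradient(x, A, H, b, lam))
```

The two printed vectors should agree closely. In this flat setting the cross-derivative and its adjoint are simply a matrix and its transpose, which is why the abstract's Riemannian cross-derivative is the piece that needs separate treatment on a manifold.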

Paper Structure (11 sections, 12 theorems, 134 equations, 2 figures, 3 tables, 3 algorithms)

Key Result

Proposition 2.1

$\mathrm{grad}_{x,y}^2$ and $\mathrm{grad}_{y,x}^2$ are adjoints, i.e. where $f\in\mathcal{C}^1(\mathcal{M})$ is any continuously differentiable function over $\mathcal{M}$.
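
For linear maps between inner-product spaces, an adjoint relation conventionally takes the form below. This is a sketch under assumed notation, not the paper's own statement: it assumes $f$ is defined on a product of two manifolds $\mathcal{M}\times\mathcal{N}$ with $x\in\mathcal{M}$ and $y\in\mathcal{N}$, writes $A$ for whichever of the two cross-derivatives maps $T_x\mathcal{M}$ to $T_y\mathcal{N}$, and writes $A^{*}$ for the other. Which operator plays which role depends on Definition 2.8.

$$\langle A[u],\, v\rangle_y \;=\; \langle u,\, A^{*}[v]\rangle_x \qquad \text{for all } u\in T_x\mathcal{M},\ v\in T_y\mathcal{N}.$$

In the Euclidean special case with a twice continuously differentiable $f$, the inner products are dot products, $A$ is the matrix of mixed second partial derivatives, and $A^{*}$ is its transpose.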

Figures (2)

  • Figure 1: Convergence curve of Algorithm \ref{algo_bilevel_robust} applied to the robust Karcher mean problem (\ref{robust_karcher}). CPU time is in seconds.
  • Figure 2: Convergence curve of Algorithm \ref{algo_bilevel_robust} applied to the robust covariance matrix maximum likelihood estimation problem (\ref{robust_cov_mle}) with different choices of $(d,n)$. CPU time is in seconds.

Theorems & Definitions (23)

  • Definition 2.1: Riemannian manifold
  • Definition 2.2: Differential and Riemannian gradients
  • Definition 2.3: Geodesic and exponential mapping
  • Definition 2.4: Geodesic (strong) convexity
  • Definition 2.5: Parallel transport
  • Definition 2.6: Geodesic Lipschitz smoothness
  • Definition 2.7: Riemannian Hessian
  • Definition 2.8: Riemannian cross-derivatives
  • Proposition 2.1
  • Proposition 3.1
  • ...and 13 more