Streamlining in the Riemannian Realm: Efficient Riemannian Optimization with Loopless Variance Reduction

Yury Demidovich, Grigory Malinovsky, Peter Richtárik

TL;DR

This study introduces the Riemannian Loopless SVRG (R-LSVRG) and PAGE (R-PAGE) methods, which replace the outer loop with probabilistic gradient computation triggered by a coin flip in each iteration, ensuring simpler proofs, efficient hyperparameter selection, and sharp convergence guarantees.

Abstract

In this study, we investigate stochastic optimization on Riemannian manifolds, focusing on the crucial variance reduction mechanism used in both Euclidean and Riemannian settings. Riemannian variance-reduced methods usually involve a double-loop structure, computing a full gradient at the start of each loop. Determining the optimal inner loop length is challenging in practice, as it depends on strong convexity or smoothness constants, which are often unknown or hard to estimate. Motivated by Euclidean methods, we introduce the Riemannian Loopless SVRG (R-LSVRG) and PAGE (R-PAGE) methods. These methods replace the outer loop with probabilistic gradient computation triggered by a coin flip in each iteration, ensuring simpler proofs, efficient hyperparameter selection, and sharp convergence guarantees. Using R-PAGE as a framework for non-convex Riemannian optimization, we demonstrate its applicability to various important settings. For example, we derive Riemannian MARINA (R-MARINA) for distributed settings with communication compression, providing the best theoretical communication complexity guarantees for non-convex distributed optimization over Riemannian manifolds. Experimental results support our theoretical findings.
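The coin-flip mechanism described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the unit sphere as the manifold, a toy leading-eigenvector objective, and tangent-space projection as a cheap stand-in for parallel transport. The function names, the default probability `p = 1/n`, and the step size are all hypothetical choices made for illustration.

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x along tangent v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return x
    y = np.cos(norm_v) * x + np.sin(norm_v) * (v / norm_v)
    return y / np.linalg.norm(y)  # guard against floating-point drift

def project_tangent(x, u):
    """Orthogonal projection of u onto the tangent space of the sphere at x."""
    return u - np.dot(x, u) * x

def objective(A, x):
    """Toy objective f(x) = -(1/2n) * sum_i (a_i^T x)^2 (assumed for illustration)."""
    return -0.5 * np.mean((A @ x) ** 2)

def riem_grad(A, x, idx=None):
    """Riemannian gradient of f (idx=None) or of a single component f_idx at x."""
    rows = A if idx is None else A[idx:idx + 1]
    egrad = -rows.T @ (rows @ x) / rows.shape[0]  # Euclidean gradient
    return project_tangent(x, egrad)

def loopless_svrg_sphere(A, x0, step=0.02, p=None, iters=3000, seed=0):
    """Loopless SVRG-style iteration on the sphere (sketch, not the paper's code)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    p = 1.0 / n if p is None else p  # assumed default refresh probability
    x = x0 / np.linalg.norm(x0)
    y = x.copy()                     # reference point
    full_grad_y = riem_grad(A, y)    # full gradient stored at the reference point
    for _ in range(iters):
        i = rng.integers(n)
        # Control variate lives in the tangent space at y; projection onto the
        # tangent space at x replaces parallel transport (an assumption).
        correction = riem_grad(A, y, i) - full_grad_y
        g = riem_grad(A, x, i) - project_tangent(x, correction)
        x_next = sphere_exp(x, -step * g)
        # Coin flip: with probability p refresh the reference point and its
        # full gradient; no inner-loop length has to be chosen.
        if rng.random() < p:
            y = x.copy()
            full_grad_y = riem_grad(A, y)
        x = x_next
    return x
```

Calling `loopless_svrg_sphere(A, x0)` with a data matrix `A` of shape `(n, d)` and a nonzero start vector `x0` returns a unit vector. The only variance-reduction hyperparameter is the probability `p`, which takes the place of the inner-loop length in a double-loop method.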

Paper Structure

This paper contains 23 sections, 28 theorems, 140 equations, 1 figure, 1 algorithm.

Key Result

Lemma 1

If $a$, $b$, and $c$ represent the sides (i.e., side lengths) of a geodesic triangle in an Alexandrov space with curvature lower bounded by $\kappa_{\min}$, and $A$ denotes the angle between sides $b$ and $c$, then the following distance bound holds:
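For reference, the trigonometric distance bound in Lemma 6 of Zhang and Sra (2016), the source named for Lemma 1 in the list of theorems, is usually stated as follows. It is written here with $\kappa_{\min}$ in place of their $\kappa$; the exact form used in this paper is assumed to match.

$$a^2 \le \frac{\sqrt{|\kappa_{\min}|}\, c}{\tanh\left(\sqrt{|\kappa_{\min}|}\, c\right)}\, b^2 + c^2 - 2bc\cos(A)$$

As $\kappa_{\min} \to 0$ the leading factor tends to $1$, and the bound reduces to the Euclidean law of cosines $a^2 = b^2 + c^2 - 2bc\cos(A)$.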

Figures (1)

  • Figure 1: Comparison of the R-LSVRG and R-SVRG methods: Left panel illustrates convergence in terms of gradient norm, while the right panel depicts convergence in terms of function values.

Theorems & Definitions (56)

  • Definition 1: Riemannian gradient
  • Lemma 1: Lemma 6 of Zhang and Sra (2016) [zhang2016first]
  • Corollary 1
  • Definition 2: Curvature-driven manifold term
  • Theorem 1
  • Corollary 2
  • Corollary 3
  • Theorem 2
  • Corollary 4
  • Corollary 5
  • ...and 46 more