Streamlining in the Riemannian Realm: Efficient Riemannian Optimization with Loopless Variance Reduction

Yury Demidovich; Grigory Malinovsky; Peter Richtárik

TL;DR

This study introduces Riemannian Loopless SVRG (R-LSVRG) and Riemannian PAGE (R-PAGE), which replace the outer loop of double-loop variance-reduced methods with a full-gradient computation triggered by a coin flip in each iteration. The loopless design yields simpler proofs, efficient hyperparameter selection, and sharp convergence guarantees.

Abstract

In this study, we investigate stochastic optimization on Riemannian manifolds, focusing on the crucial variance reduction mechanism used in both Euclidean and Riemannian settings. Riemannian variance-reduced methods usually involve a double-loop structure, computing a full gradient at the start of each loop. Determining the optimal inner loop length is challenging in practice, as it depends on strong convexity or smoothness constants, which are often unknown or hard to estimate. Motivated by Euclidean methods, we introduce the Riemannian Loopless SVRG (R-LSVRG) and PAGE (R-PAGE) methods. These methods replace the outer loop with probabilistic gradient computation triggered by a coin flip in each iteration, ensuring simpler proofs, efficient hyperparameter selection, and sharp convergence guarantees. Using R-PAGE as a framework for non-convex Riemannian optimization, we demonstrate its applicability to various important settings. For example, we derive Riemannian MARINA (R-MARINA) for distributed settings with communication compression, providing the best theoretical communication complexity guarantees for non-convex distributed optimization over Riemannian manifolds. Experimental results support our theoretical findings.
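To make the coin-flip mechanism concrete, here is a minimal sketch of a loopless SVRG-style iteration on the unit sphere. It is an illustration under stated assumptions, not the authors' algorithm: the objective $f_i(x) = -x^\top A_i x$, the function names, the use of orthogonal projection as a cheap stand-in for parallel transport, and the choice to move the reference point to the new iterate are all assumptions made for this example.

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on the unit sphere: move from x along tangent vector v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return x
    return np.cos(norm_v) * x + np.sin(norm_v) * (v / norm_v)

def project_tangent(x, u):
    """Orthogonal projection of u onto the tangent space of the sphere at x."""
    return u - np.dot(x, u) * x

def riemannian_grad_i(A_i, x):
    """Riemannian gradient of the assumed component f_i(x) = -x^T A_i x."""
    return project_tangent(x, -2.0 * (A_i @ x))

def full_riemannian_grad(mats, x):
    """Average of the component gradients (one full pass over the data)."""
    return np.mean([riemannian_grad_i(A, x) for A in mats], axis=0)

def r_lsvrg_sketch(mats, x0, step, p, iters, seed=0):
    """Loopless SVRG-style iteration on the sphere (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    n = len(mats)
    x = x0 / np.linalg.norm(x0)
    y = x.copy()                              # reference point
    full_grad_y = full_riemannian_grad(mats, y)
    for _ in range(iters):
        i = rng.integers(n)
        # Control variate lives in the tangent space at y; projection onto the
        # tangent space at x is an ASSUMED substitute for parallel transport.
        correction = riemannian_grad_i(mats[i], y) - full_grad_y
        g = riemannian_grad_i(mats[i], x) - project_tangent(x, correction)
        x = sphere_exp(x, -step * g)
        x = x / np.linalg.norm(x)             # guard against numerical drift
        # Coin flip: with probability p, refresh the reference point and its
        # full gradient. This replaces the outer loop of double-loop SVRG.
        if rng.random() < p:
            y = x.copy()
            full_grad_y = full_riemannian_grad(mats, y)
    return x
```

The only tunable quantity that replaces the inner-loop length is the probability `p`; a choice such as `p = 1/n` keeps the expected cost of full-gradient refreshes comparable to one stochastic gradient per iteration.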

Paper Structure (23 sections, 28 theorems, 140 equations, 1 figure, 1 algorithm)

Introduction
Contributions
Preliminaries
Riemannian LSVRG
Riemannian PAGE
The Riemannian PAGE gradient estimator
Convergence in Non-Convex Finite-Sum Setting
Convergence in Non-Convex Online Setting
Riemannian MARINA
Experiments
Conclusion
Extended Related Work
Riemannian Optimization
Variance Reduction
Communication Compression
...and 8 more sections

Key Result

Lemma 1

If $a$, $b$, and $c$ represent the sides (i.e., side lengths) of a geodesic triangle in an Alexandrov space with curvature lower bounded by $\kappa_{\min}$, and $A$ denotes the angle between sides $b$ and $c$, then the following distance bound holds:
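
For reference, this trigonometric distance bound is usually stated in the following form in the cited work of Zhang and Sra (2016). The notation is assumed to match the statement above and should be checked against the paper itself:

$$a^2 \le \frac{\sqrt{|\kappa_{\min}|}\, c}{\tanh\left(\sqrt{|\kappa_{\min}|}\, c\right)}\, b^2 + c^2 - 2bc\cos(A).$$

As $\kappa_{\min} \to 0$ the leading factor tends to $1$, and the bound reduces to the Euclidean law of cosines, $a^2 = b^2 + c^2 - 2bc\cos(A)$, as an inequality.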

Figures (1)

Figure 1: Comparison of the R-LSVRG and R-SVRG methods: Left panel illustrates convergence in terms of gradient norm, while the right panel depicts convergence in terms of function values.

Theorems & Definitions (56)

Definition 1: Riemannian gradient
Lemma 1: Zhang & Sra (2016), Lemma 6
Corollary 1
Definition 2: Curvature-driven manifold term
Theorem 1
Corollary 2
Corollary 3
Theorem 2
Corollary 4
Corollary 5
...and 46 more
