A general framework of Riemannian adaptive optimization methods with a convergence analysis
Hiroyuki Sakai, Hideaki Iiduka
TL;DR
The paper addresses stochastic optimization on Riemannian manifolds by introducing a general framework for adaptive methods on embedded submanifolds of $\mathbb{R}^d$. The framework unifies algorithms such as SGD, AdaGrad, RMSProp, Adam, and AMSGrad by projecting their updates onto tangent spaces. Within it, the paper presents RAMSGrad as a direct extension of AMSGrad to embedded submanifolds and gives convergence analyses for both constant and diminishing step sizes, including the case of increasing mini-batch sizes. With $K$ iterations and mini-batch size $b$, the rates scale as $\mathcal{O}\left(\frac{1}{K}+\frac{1}{b}\right)$ for constant step sizes and $\mathcal{O}\left(\left(1+\frac{1}{b}\right)\frac{\log K}{\sqrt{K}}\right)$ for diminishing step sizes, and they improve when the mini-batch size $b_k$ grows over the iterations. The analysis hinges on projecting the adaptive update onto the tangent space via the orthogonal projection $P_x$ and on retraction-Lipschitz smoothness to establish descent. Numerical experiments on PCA (posed on the Stiefel manifold) and LRMC (posed on the Grassmann manifold) show that RAMSGrad and RAdam perform competitively, supporting both the convergence theory and the practical effectiveness of the methods on Riemannian optimization problems.
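To make the projection-then-retraction idea concrete, here is a minimal sketch of an AMSGrad-style step on the unit sphere, the simplest embedded submanifold of $\mathbb{R}^d$. It is not the authors' implementation: the function names, the choice of the sphere, the normalization retraction, the hyperparameter defaults, and the toy objective are all assumptions made for illustration, and the exact placement of $P_x$ in the paper's algorithm may differ.

```python
import numpy as np

def project_tangent(x, u):
    # Orthogonal projection P_x of u onto the tangent space of the unit sphere at x.
    return u - np.dot(x, u) * x

def retract(x, xi):
    # Normalization retraction: step in the ambient space, then rescale to unit norm.
    y = x + xi
    return y / np.linalg.norm(y)

def amsgrad_like_step(x, g, m, v, v_hat,
                      alpha=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
    # Hypothetical helper: Euclidean AMSGrad moment updates, followed by a
    # tangent-space projection of the adaptive direction and a retraction.
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    v_hat = np.maximum(v_hat, v)           # AMSGrad: keep the running maximum
    d = -m / (np.sqrt(v_hat) + eps)        # adaptive direction (generally not tangent)
    x_new = retract(x, alpha * project_tangent(x, d))
    return x_new, m, v, v_hat

# Toy problem (assumption): minimize f(x) = -x^T A x over the unit sphere.
A = np.diag([3.0, 2.0, 1.0])
x = np.ones(3) / np.sqrt(3.0)
rq_init = x @ A @ x                        # Rayleigh quotient at the start
m, v, v_hat = np.zeros(3), np.zeros(3), np.zeros(3)
for _ in range(500):
    g = project_tangent(x, -2.0 * A @ x)   # Riemannian gradient of f at x
    x, m, v, v_hat = amsgrad_like_step(x, g, m, v, v_hat)
rq_final = x @ A @ x
```

The coordinate-wise scaling by $\sqrt{\hat v}$ generally pushes the direction out of the tangent space, which is why the sketch applies the projection before the retraction; every iterate then stays on the sphere by construction.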
Abstract
This paper proposes a general framework of Riemannian adaptive optimization methods. The framework encapsulates several stochastic optimization algorithms on Riemannian manifolds and incorporates the mini-batch strategy that is often used in deep learning. Within this framework, we also propose AMSGrad on embedded submanifolds of Euclidean space. Moreover, we give convergence analyses that are valid for both a constant and a diminishing step size. Our analyses also reveal the relationship between the convergence rate and the mini-batch size. In numerical experiments, we applied the proposed algorithm to principal component analysis and the low-rank matrix completion problem, both of which can be formulated as Riemannian optimization problems. Python implementations of the methods used in the numerical experiments are available at https://github.com/iiduka-researches/202408-adaptive.
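As an illustration of how principal component analysis can be treated as a Riemannian optimization problem with mini-batches, the sketch below computes a mini-batch Riemannian gradient on the Stiefel manifold and takes one retraction step. It is written independently of the linked repository. The trace form of the objective, $f_B(X) = -\frac{1}{b}\sum_{i \in B} \|X^\top z_i\|^2$, the QR-based retraction, the synthetic data, the step size, and all function names are assumptions chosen for brevity.

```python
import numpy as np

def sym(M):
    # Symmetric part of a square matrix.
    return 0.5 * (M + M.T)

def project_tangent_stiefel(X, G):
    # Orthogonal projection onto the tangent space of the Stiefel manifold
    # at X (embedded metric): P_X(G) = G - X sym(X^T G).
    return G - X @ sym(X.T @ G)

def retract_qr(X, Xi):
    # QR-based retraction; column signs are fixed so that diag(R) > 0.
    Q, R = np.linalg.qr(X + Xi)
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs

def minibatch_rgrad(X, Z_batch):
    # Riemannian gradient of f_B(X) = -(1/b) * sum_i ||X^T z_i||^2,
    # where the rows of Z_batch are the b sampled data points z_i.
    b = Z_batch.shape[0]
    egrad = -2.0 * (Z_batch.T @ (Z_batch @ X)) / b   # Euclidean gradient
    return project_tangent_stiefel(X, egrad)

# Synthetic data (assumption): 200 samples in R^5, target subspace dimension 2.
rng = np.random.default_rng(0)
Z = rng.standard_normal((200, 5))
X0, _ = np.linalg.qr(rng.standard_normal((5, 2)))    # random point on the manifold
batch = rng.choice(200, size=32, replace=False)      # mini-batch of size b = 32
rg = minibatch_rgrad(X0, Z[batch])
X1 = retract_qr(X0, -0.1 * rg)                       # one Riemannian SGD step
```

Swapping the plain gradient step for an adaptive direction leaves the rest of the pattern unchanged: compute a direction, project it onto the tangent space at the current point, then retract.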
