Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space
Mingyang Yi, Bohan Wang
TL;DR
The paper addresses optimization over probability measures in the second-order Wasserstein space by developing continuous stochastic Riemannian flows. It builds three flows (Riemannian GD, SGD, and SVRG) for minimizing $D_{KL}(\pi||\mu)$: each is obtained by approximating the discrete Riemannian dynamics with a Euclidean SDE and then describing the evolution of the resulting distribution via the Fokker-Planck equation. The authors prove convergence results that align with Euclidean theory: a rate of $O(1/\sqrt{T})$ for Riemannian SGD and a complexity of $O(N^{2/3}/\epsilon)$ for Riemannian SVRG. Under a log-Sobolev (Riemannian PL) inequality, the corresponding global rates are $O(1/T)$ and exponential decay, respectively. They also connect these flows to Langevin-type sampling, establish discrete-to-continuous correspondences (including SGLD and SVRG-Langevin), and validate the theory through Gaussian and mixture-Gaussian experiments. Overall, the work provides a principled framework for continuous stochastic Riemannian optimization on Wasserstein space and offers tools for analyzing discrete stochastic Riemannian algorithms.
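For orientation, the deterministic baseline behind these flows can be written out using standard facts about Langevin dynamics. This is a sketch in assumed notation, not the paper's own derivation: $\pi_t$ denotes the law of $x_t$, $W_t$ a standard Brownian motion, and $\gamma > 0$ a log-Sobolev constant of the target $\mu$; none of these symbols are taken from the source.

$$
\begin{aligned}
dx_t &= \nabla \log \mu(x_t)\,dt + \sqrt{2}\,dW_t
&&\text{(Langevin SDE)}\\
\partial_t \pi_t &= \nabla\cdot\Big(\pi_t \nabla \log\frac{\pi_t}{\mu}\Big)
&&\text{(Fokker-Planck equation)}\\
\frac{d}{dt} D_{KL}(\pi_t||\mu) &= -\int \pi_t \Big\|\nabla \log\frac{\pi_t}{\mu}\Big\|^2 dx
&&\text{(decay of the objective)}\\
D_{KL}(\pi||\mu) &\le \frac{1}{2\gamma}\int \pi \Big\|\nabla \log\frac{\pi}{\mu}\Big\|^2 dx
&&\text{(log-Sobolev inequality)}\\
D_{KL}(\pi_t||\mu) &\le e^{-2\gamma t}\, D_{KL}(\pi_0||\mu)
&&\text{(exponential convergence)}
\end{aligned}
$$

The last line follows by inserting the log-Sobolev inequality into the decay identity, which gives $\frac{d}{dt} D_{KL}(\pi_t||\mu) \le -2\gamma\, D_{KL}(\pi_t||\mu)$, and then integrating in time. This is the sense in which the log-Sobolev inequality plays the role of a PL condition on Wasserstein space.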
Abstract
Recently, optimization on Riemannian manifolds has provided valuable insights to the optimization community. In this regard, extending these methods to the Wasserstein space is of particular interest, since optimization on Wasserstein space is closely connected to practical sampling processes. Generally, the standard (continuous) optimization method on Wasserstein space is the Riemannian gradient flow (i.e., Langevin dynamics when minimizing the KL divergence). In this paper, we aim to enrich the family of continuous optimization methods in the Wasserstein space by extending the gradient flow on it to the stochastic gradient descent (SGD) flow and the stochastic variance reduction gradient (SVRG) flow. By leveraging the properties of Wasserstein space, we construct stochastic differential equations (SDEs) to approximate the corresponding discrete Euclidean dynamics of the desired Riemannian stochastic methods. Then, we obtain the flows in Wasserstein space via the Fokker-Planck equation. Finally, we establish convergence rates of the proposed stochastic flows, which align with those known in the Euclidean setting.
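To make the discrete Euclidean dynamics mentioned above concrete, the following is a minimal sketch of three particle updates: full-gradient Langevin, SGLD, and an SVRG-style Langevin step. It rests on assumptions that are not from the paper: the target is $\mu \propto \exp(-f)$ with a one-dimensional finite-sum potential $f(x) = \frac{1}{N}\sum_i \frac{1}{2} w_i (x - a_i)^2$, the SDE is discretized with a plain Euler-Maruyama step, and the SVRG estimator is the standard Euclidean one. The names `grad_i`, `full_grad`, and `run_langevin` are hypothetical, and the code is not the authors' experimental setup.

```python
import numpy as np

def grad_i(x, a, w, idx):
    """Gradient of f_i(x) = 0.5 * w_i * (x - a_i)^2, one sampled index per particle."""
    return w[idx] * (x - a[idx])

def full_grad(x, a, w):
    """Gradient of f(x) = mean_i f_i(x), evaluated at every particle."""
    return w.mean() * x - (w * a).mean()

def run_langevin(a, w, mode="gd", n_particles=4000, n_steps=600,
                 step=0.05, epoch_len=25, seed=0):
    """Evolve particles toward mu proportional to exp(-f) (illustrative assumption).

    mode="gd"   : full-gradient Langevin step
    mode="sgd"  : SGLD, one random component gradient per particle
    mode="svrg" : variance-reduced gradient with a periodically refreshed anchor
    """
    rng = np.random.default_rng(seed)
    x = 3.0 * rng.normal(size=n_particles)       # broad initial distribution pi_0
    anchor, anchor_grad = x.copy(), full_grad(x, a, w)
    for k in range(n_steps):
        if mode == "gd":
            g = full_grad(x, a, w)
        else:
            idx = rng.integers(len(a), size=n_particles)
            if mode == "sgd":
                g = grad_i(x, a, w, idx)
            elif mode == "svrg":
                if k % epoch_len == 0:           # refresh anchor and its full gradient
                    anchor, anchor_grad = x.copy(), full_grad(x, a, w)
                # unbiased estimate of full_grad(x) with reduced variance near the anchor
                g = grad_i(x, a, w, idx) - grad_i(anchor, a, w, idx) + anchor_grad
            else:
                raise ValueError(f"unknown mode: {mode}")
        # Euler-Maruyama step: drift along -grad f plus Gaussian diffusion
        x = x - step * g + np.sqrt(2.0 * step) * rng.normal(size=n_particles)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.normal(2.0, 1.0, size=50)            # synthetic component centers
    w = rng.uniform(0.5, 1.5, size=50)           # synthetic component curvatures
    print("target mean/var:", (w * a).mean() / w.mean(), 1.0 / w.mean())
    for mode in ("gd", "sgd", "svrg"):
        x = run_langevin(a, w, mode=mode)
        print(mode, "empirical mean/var:", x.mean(), x.var())
```

With this quadratic potential the target is Gaussian with mean $\sum_i w_i a_i / \sum_i w_i$ and variance $1/\bar{w}$, so the empirical moments of the particles give a quick sanity check. The finite step size and the gradient noise both inflate the stationary variance slightly, which is the kind of discretization effect the continuous flows abstract away.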
