Quantitative Convergences of Lie Group Momentum Optimizers
Lingkai Kong, Molei Tao
TL;DR
The paper develops two Lie-group momentum optimizers, Lie Heavy-Ball and Lie NAG-SC, derived via variational principles and left-trivialization so that they require only a gradient oracle and the exponential map. Under $L$-smoothness and local geodesic $\mu$-strong convexity, Lie Heavy-Ball achieves a non-accelerated linear rate, while Lie NAG-SC attains acceleration with a $\sqrt{\kappa}$-type dependence, where $\kappa=L/\mu$, albeit with a curvature term $p(a)$ reflecting the Lie-group geometry. The discretizations stay on the manifold via splitting and Euclidean-inspired momentum, avoiding costly operations such as the logarithm map and parallel transport, which improves practicality on Lie groups such as $\mathsf{SO}(n)$. The theoretical results are supported by systematic numerical tests on eigenvalue decomposition problems, which show that Lie NAG-SC outperforms Lie Heavy-Ball on ill-conditioned tasks and which validate the proposed rates and the role of curvature.
Abstract
Explicit, momentum-based dynamics that optimize functions defined on Lie groups can be constructed via variational optimization and momentum trivialization. Structure-preserving time discretizations can then turn these dynamics into optimization algorithms. This article investigates two such discretizations: Lie Heavy-Ball, which is a known splitting scheme, and Lie NAG-SC, which is newly proposed. Their convergence rates are explicitly quantified under $L$-smoothness and local strong convexity assumptions. Lie NAG-SC provides acceleration over the momentumless case, i.e. Riemannian gradient descent, but Lie Heavy-Ball does not. Compared to existing accelerated optimizers for general manifolds, both Lie Heavy-Ball and Lie NAG-SC are computationally cheaper and easier to implement, thanks to their use of the group structure. Only a gradient oracle and the exponential map are required, not the logarithm map or parallel transport, which are computationally costly.
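To make the "gradient oracle plus exponential map" claim concrete, the following is a minimal sketch of a heavy-ball-style iteration with trivialized momentum on $\mathsf{SO}(n)$, applied to a small eigenvalue problem. It is an illustration under stated assumptions, not the paper's exact scheme: the objective $f(R)=\mathrm{tr}(R^\top A R N)$, the step size `h`, the friction `gamma`, the iteration count, and all function names are hypothetical choices made for this example.

```python
import numpy as np

def skew_expm(xi):
    """Exponential of a real skew-symmetric matrix (illustrative implementation)."""
    # 1j * xi is Hermitian, so 1j*xi = V diag(w) V^H and xi = V diag(-1j*w) V^H.
    w, V = np.linalg.eigh(1j * xi)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real

def trivialized_grad(R, A, N):
    """Left-trivialized gradient of f(R) = tr(R^T A R N), Frobenius inner product."""
    M = R.T @ A @ R
    G = M @ N - N @ M
    return 0.5 * (G - G.T)  # remove round-off so the result is exactly skew

def lie_heavy_ball(A, N, h=0.1, gamma=2.0, steps=3000):
    """Heavy-ball-style sketch: momentum lives in the Lie algebra, R stays on SO(n)."""
    n = A.shape[0]
    R = np.eye(n)          # group element, starts at the identity
    xi = np.zeros((n, n))  # trivialized momentum (skew-symmetric)
    for _ in range(steps):
        # Friction plus gradient kick on the momentum (assumed parameter values).
        xi = (1.0 - gamma * h) * xi - h * trivialized_grad(R, A, N)
        # Move along the group with the exponential map; no logarithm or transport.
        R = R @ skew_expm(h * xi)
    return R

# Usage: diagonalize a symmetric matrix with known eigenvalues 3, 2, 1.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A = Q @ np.diag([3.0, 2.0, 1.0]) @ Q.T
N = np.diag([1.0, 2.0, 3.0])
R = lie_heavy_ball(A, N)
D = R.T @ A @ R  # approximately diagonal at a minimizer
```

Each iteration uses only the gradient and one exponential of a skew-symmetric matrix, and the iterate remains orthogonal up to round-off because it is updated by right-multiplication with a rotation. At a minimizer of this objective, `D` is diagonal and holds the eigenvalues of `A`.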
