A Riemannian Proximal Newton-CG Method

Wen Huang; Wutao Si

TL;DR

This work addresses the lack of a global convergence guarantee for the Riemannian proximal Newton method in nonsmooth optimization on Riemannian manifolds. It combines a truncated conjugate gradient solver with the Riemannian proximal Newton step, producing an RPN-CG method. Global convergence and local superlinear convergence are proven under standard assumptions, and a hybrid variant, RPN-CGH, is shown to be robust to the switching parameter. On sparse PCA, compressed modes (CM), and community detection (CD) problems, the approach outperforms state-of-the-art proximal-gradient-type methods in practice. Overall, the method provides a scalable second-order algorithm for nonsmooth manifold optimization with theoretical guarantees and empirical efficiency.

Abstract

Recently, a Riemannian proximal Newton method has been developed for optimizing problems in the form of $\min_{x\in\mathcal{M}} f(x) + \mu\|x\|_1$, where $\mathcal{M}$ is a compact embedded submanifold and $f(x)$ is smooth. Although this method converges superlinearly locally, global convergence is not guaranteed. The existing remedy relies on a hybrid approach: running a Riemannian proximal gradient method until the iterate is sufficiently accurate and switching to the Riemannian proximal Newton method. This existing approach is sensitive to the switching parameter. This paper proposes a Riemannian proximal Newton-CG method that merges the truncated conjugate gradient method with the Riemannian proximal Newton method. The global convergence and local superlinear convergence are proven. Numerical experiments show that the proposed method outperforms other state-of-the-art methods.
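To make the "truncated conjugate gradient" ingredient concrete, here is a minimal Euclidean sketch in Python with NumPy. It is not the paper's algorithm: it ignores the manifold and the $\ell_1$ term, and the function name `truncated_cg`, its parameters, and the steepest-descent fallback are illustrative assumptions. It only shows the generic idea of solving a Newton-type system $Hd = -g$ inexactly and stopping early when the residual is small or nonpositive curvature is detected.

```python
import numpy as np

def truncated_cg(hess_vec, grad, tol=1e-8, max_iter=50):
    """Approximately solve H d = -grad by CG, stopping early (hypothetical helper).

    hess_vec: callable returning H @ v for a symmetric matrix H.
    """
    d = np.zeros_like(grad, dtype=float)
    r = -np.asarray(grad, dtype=float)  # residual of H d = -grad at d = 0
    p = r.copy()                        # first search direction
    rr = r @ r
    for _ in range(max_iter):
        if np.sqrt(rr) <= tol:          # residual small enough: stop
            break
        Hp = hess_vec(p)
        curvature = p @ Hp
        if curvature <= 0:
            # Nonpositive curvature: truncate. If no step has been taken yet,
            # fall back to the steepest-descent direction (an assumption here).
            if not d.any():
                d = -np.asarray(grad, dtype=float)
            break
        alpha = rr / curvature
        d = d + alpha * p
        r = r - alpha * Hp
        rr_new = r @ r
        p = r + (rr_new / rr) * p       # standard CG direction update
        rr = rr_new
    return d

# Usage on a small positive definite system.
H = np.array([[4.0, 1.0], [1.0, 3.0]])
g = np.array([-1.0, -2.0])
d = truncated_cg(lambda v: H @ v, g)
```

On a positive definite system the sketch reduces to ordinary CG and recovers the exact Newton direction; the early exits are what make the inner solve cheap and safe far from a solution.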

Paper Structure (21 sections, 17 theorems, 70 equations, 4 figures, 4 tables, 3 algorithms)

Introduction
Notation and Preliminaries
Riemannian manifold
The existing Riemannian proximal Newton method
A Riemannian Proximal Newton-CG Method
Global convergence analysis
Local convergence analysis
Reformulation of the Newton equation (3.3)
Termination conditions of the truncated CG algorithm (tCG)
Superlinear convergence analysis
Numerical Experiments
Tested problems, support estimation, parameter setting, and testing environment
Tested problems:
Support estimation:
Parameter setting:
...and 6 more sections

Key Result

Proposition 2.1

Suppose $x_* = [\bar{x}_*^T, 0^T]^T$ is a local minimizer with $\bar{x}_* \in \mathbb{R}^j$ and $\bar{B}_{x_*}$ has full column rank. Then $v(x_*) = 0$ and $\mathcal{B}_{x_*} \succeq 0$ on the subspace $\mathfrak{L}_{x_*}$, where $\mathfrak{L}_x$ is defined by $\mathfrak{L}_x = \{w: \bar{B}_{x}^{T} w = 0\}$.
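The condition "positive semidefinite on the subspace $\{w: \bar{B}_{x}^{T} w = 0\}$" can be checked numerically for small dense matrices. The sketch below is a generic linear-algebra illustration, not code from the paper: `H` and `B` are stand-in NumPy arrays for $\mathcal{B}_{x_*}$ and $\bar{B}_{x_*}$, and the helper name `is_psd_on_nullspace` is hypothetical. It builds an orthonormal basis `Z` of the null space of `B.T` from an SVD and tests the eigenvalues of the reduced matrix `Z.T @ H @ Z`.

```python
import numpy as np

def is_psd_on_nullspace(H, B, tol=1e-10):
    """Check w^T H w >= 0 for all w with B^T w = 0 (hypothetical helper)."""
    # Left singular vectors beyond rank(B) span the null space of B^T.
    U, s, _ = np.linalg.svd(B, full_matrices=True)
    rank = int(np.sum(s > tol))
    Z = U[:, rank:]
    if Z.shape[1] == 0:
        return True  # the subspace is {0}, so the condition holds trivially
    reduced = Z.T @ H @ Z
    reduced = (reduced + reduced.T) / 2  # symmetrize against round-off
    return bool(np.linalg.eigvalsh(reduced).min() >= -tol)

# H is indefinite on R^2, but semidefinite on one coordinate axis.
H = np.diag([1.0, -1.0])
on_e1 = is_psd_on_nullspace(H, np.array([[0.0], [1.0]]))  # subspace span(e1)
on_e2 = is_psd_on_nullspace(H, np.array([[1.0], [0.0]]))  # subspace span(e2)
```

The example illustrates why the restriction matters: the same matrix passes the test on one subspace and fails it on another, so the second-order condition in the proposition is weaker than positive semidefiniteness on the whole space.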

Figures (4)

Figure 1: The five principal components used in the synthetic data.
Figure 2: Sparse PCA: plots of $\|v(x_k)\|$ versus iterations and CPU times respectively. The left two plots are generated by random data and the right two plots are generated by synthetic data with $(n, p, \mu) = (4000, 5, 0.8)$ and $\epsilon = 10^{-3}$.
Figure 3: CM: plots of $\|v(x_k)\|$ versus iterations and CPU times respectively.
Figure 4: Community Detection: plots of $\|v(x_k)\|$ versus iterations and CPU times respectively.

Theorems & Definitions (35)

Proposition 2.1
Remark 3.1
Lemma 3.1
proof
Lemma 3.2
proof
Theorem 3.1
proof
Definition 3.1: Geodesically strongly convex
Lemma 3.3
...and 25 more
