Proximal Quasi-Newton Method for Composite Optimization over the Stiefel Manifold

Qinsi Wang, Wei Hong Yang

TL;DR

This work extends proximal Newton-type methods to Riemannian settings by proposing ManPQN, a proximal quasi-Newton method for composite optimization over the Stiefel manifold. By employing a damped LBFGS-like update with a diagonal approximation and a nonmonotone line search, ManPQN achieves global convergence with a quantified iteration complexity and local linear convergence under mild Hessian assumptions. Across CM, Sparse PCA, and Joint Diagonalization, ManPQN markedly accelerates convergence and reduces CPU time relative to state-of-the-art Riemannian proximal gradient methods, particularly on large-scale problems. The approach broadens the toolkit for efficient optimization on manifolds and offers practical speedups for structured matrix problems in machine learning and signal processing.
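The summary mentions a nonmonotone line search without stating the rule. As a rough illustration only, the sketch below implements a generic max-type nonmonotone backtracking test (accept a step when the new objective value falls sufficiently below the maximum of the last few values). The function name, the parameters `sigma` and `gamma`, the `retract` argument, and the exact acceptance condition are assumptions for illustration and are not taken from the paper.

```python
from collections import deque


def nonmonotone_backtracking(F, x, d, decrease, history,
                             sigma=1e-4, gamma=0.5, max_backtracks=50,
                             retract=lambda x, v: x + v):
    """Generic max-type nonmonotone backtracking (illustrative, not ManPQN's exact rule).

    F        : objective, callable on a point
    x, d     : current point and search direction
    decrease : positive quantity measuring predicted decrease (e.g. ||d||^2)
    history  : recent objective values (e.g. a deque with maxlen set)
    retract  : maps (x, step) to a new point; plain addition by default,
               a manifold retraction would be passed here instead
    """
    reference = max(history)  # nonmonotone reference value
    alpha = 1.0
    for _ in range(max_backtracks):
        if F(retract(x, alpha * d)) <= reference - sigma * alpha * decrease:
            return alpha
        alpha *= gamma  # shrink the step and try again
    return alpha


# Toy usage on a scalar quadratic (Euclidean, so the default retract applies).
F = lambda x: (x - 3.0) ** 2
x0, d0 = 0.0, 6.0  # d0 is the negative gradient of F at x0
monotone_alpha = nonmonotone_backtracking(F, x0, d0, d0 * d0, deque([F(x0)], maxlen=5))
relaxed_alpha = nonmonotone_backtracking(F, x0, d0, d0 * d0, deque([F(x0), 20.0], maxlen=5))
```

With a single value in `history` the test reduces to the usual Armijo-type condition; a longer history relaxes the test and lets larger steps through, which is the general motivation for nonmonotone rules.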

Abstract

In this paper, we consider composite optimization problems over the Stiefel manifold. A successful method for this class of problems is the proximal gradient method proposed by Chen et al. Motivated by proximal Newton-type techniques in Euclidean space, we present a Riemannian proximal quasi-Newton method, named ManPQN, to solve these problems. We prove global convergence of ManPQN and analyze its iteration complexity for obtaining an $\epsilon$-stationary point. Under some mild conditions, we also establish local linear convergence of ManPQN. Numerical results are encouraging and show that the proximal quasi-Newton technique can accelerate the proximal gradient method.
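The abstract does not write out the problem class. For orientation, composite problems over the Stiefel manifold are commonly posed in the following form; the symbols $f$, $h$, $F$, and $\mathrm{St}(n,r)$ are notation assumed here, not quoted from the paper.

```latex
\min_{X \in \mathrm{St}(n,r)} \; F(X) := f(X) + h(X),
\qquad
\mathrm{St}(n,r) := \{\, X \in \mathbb{R}^{n \times r} : X^{\top} X = I_r \,\},
```

Here $f$ is typically assumed smooth and $h$ convex but possibly nonsmooth, for example a sparsity-promoting term such as $\mu \|X\|_1$ with $\mu > 0$.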

Paper Structure

This paper contains 15 sections, 10 theorems, 111 equations, 12 figures, 10 tables, and 1 algorithm.

Key Result

Proposition 2.1

(2order_boudness2016) Suppose $\mathcal{M}$ is a compact embedded submanifold of a Euclidean space $E$, and ${\bf R}$ is a retraction. Then there exist $M_1, M_2>0$ such that for all $X\in\mathcal{M}$ and for all $\xi\in{\rm T}_X\mathcal{M}$,
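The inequalities that complete this statement are not shown in this extract. Results of this kind are usually stated as $\|{\bf R}_X(\xi)-X\|\le M_1\|\xi\|$ and $\|{\bf R}_X(\xi)-(X+\xi)\|\le M_2\|\xi\|^2$; the sketch below assumes that form. It also assumes the QR-based retraction on the Stiefel manifold and the Frobenius norm, neither of which is fixed by the text above, and simply estimates the two ratios on random samples with $\|\xi\|\le 1$. The function names are hypothetical.

```python
import numpy as np


def qr_retraction(X, xi):
    """QR-based retraction (assumed here): Q factor of X + xi, with R's diagonal made positive."""
    Q, R = np.linalg.qr(X + xi)
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs  # flip columns so the factorization is the unique one


def project_tangent(X, Z):
    """Orthogonal projection of Z onto the tangent space of the Stiefel manifold at X."""
    XtZ = X.T @ Z
    return Z - X @ ((XtZ + XtZ.T) / 2.0)


def bound_ratios(n=50, r=4, trials=200, seed=0):
    """Largest observed first- and second-order ratios over random (X, xi) pairs."""
    rng = np.random.default_rng(seed)
    first, second = 0.0, 0.0
    for _ in range(trials):
        X, _ = np.linalg.qr(rng.standard_normal((n, r)))  # random point with X^T X = I
        xi = project_tangent(X, rng.standard_normal((n, r)))
        xi *= rng.uniform(1e-3, 1.0) / np.linalg.norm(xi)  # tangent vector, norm in [1e-3, 1]
        Y = qr_retraction(X, xi)
        norm_xi = np.linalg.norm(xi)
        first = max(first, np.linalg.norm(Y - X) / norm_xi)
        second = max(second, np.linalg.norm(Y - X - xi) / norm_xi ** 2)
    return first, second


first_ratio, second_ratio = bound_ratios()
```

The observed maxima are only empirical lower estimates of constants like $M_1$ and $M_2$ for this particular retraction and sample range; they do not replace the proposition.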

Figures (12)

  • Figure 1: Comparison on CM problem, different $n=\{64,128,256,512\}$ with $r= 4$ and $\mu=0.1$.
  • Figure 2: Comparison on CM problem, different $r=\{1,2,4,6,8\}$ with $n= 128$ and $\mu=0.15$.
  • Figure 3: Comparison on CM problem, different $\mu=\{0.05,0.10,0.15,0.20,0.25\}$ with $n= 128$ and $r=4$.
  • Figure 4: Comparison on Sparse PCA problem, different $n=\{100,200, 500, 800, 1000, 1500\}$ with $r= 5$ and $\mu=0.8$.
  • Figure 5: Comparison on Sparse PCA problem, different $r=\{1,2,4,8,10\}$ with $n= 800$ and $\mu=0.6$.
  • ...and 7 more figures

Theorems & Definitions (27)

  • Definition 2.1
  • Definition 2.2
  • Proposition 2.1
  • Definition 2.3
  • Definition 2.4: Generalized Clarke subdifferential (mashiqian2020, ywh2013)
  • Definition 2.5: Regular function (ywh2013)
  • Lemma 3.1
  • proof
  • Lemma 4.1
  • proof
  • ...and 17 more