Convergence analysis of the transformed gradient projection algorithms on compact matrix manifolds

Wentao Ding, Jianze Li, Shuzhong Zhang

TL;DR

The paper addresses optimization of a Lipschitz-gradient objective on compact matrix manifolds, focusing on projection-based line-search methods whose theory lags behind retraction-based approaches. It introduces the Transformed Gradient Projection (TGP) framework, which augments gradient information with left/right scaling and a normal component to unify and extend many existing ProjLS and RetrLS methods. The authors establish weak convergence, iteration complexity, and global convergence under Armijo, Zhang-Hager nonmonotone Armijo, and fixed stepsizes, supported by new geometric inequalities for projections onto Stiefel and Grassmann manifolds and Łojasiewicz-based arguments. Through extensive numerical experiments, they demonstrate that scaling choices and the normal component critically influence performance, enabling superior results in several cases compared to classical RGD/EGP or Shifted PM methods. The framework not only subsumes known algorithms but also yields new variants (e.g., TGP-A-Eigen) with practical benefits, providing a versatile approach for manifold optimization with orthogonality constraints.

Abstract

In this paper, we study optimization problems on compact matrix manifolds. Existing feasible algorithms can be broadly categorized into retraction-based and projection-based methods. Retraction-based line-search (RetrLS) algorithms, which use only tangent vectors, are supported by a comprehensive algorithmic and convergence theory; by contrast, the theoretical understanding and algorithmic design of projection-based line-search (ProjLS) algorithms remain limited, especially when general search directions and stepsizes are involved. To bridge this gap, we propose a unified algorithmic framework called the Transformed Gradient Projection (TGP) algorithm. The key idea is to construct the search direction as a transformed Riemannian (or Euclidean) gradient augmented by an additional normal component, which allows the framework to encompass and generalize numerous existing algorithms. We then study the convergence properties of TGP algorithms under three stepsize rules: the Armijo stepsize, the Zhang-Hager type nonmonotone Armijo stepsize, and fixed stepsizes. To this end, we analyze in detail the geometric properties of the projection onto compact matrix manifolds, which may be of independent interest. Building on these results, we establish the weak convergence, iteration complexity, and global convergence of TGP algorithms under each of the three stepsize rules. When the compact matrix manifold is the Stiefel or Grassmann manifold, our convergence results either encompass or surpass those found in the literature. Finally, through numerical experiments and theoretical analysis, we observe that different choices of scaling matrices and normal components in the search direction of TGP algorithms can lead to significantly different performance in practice.
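
To make the projection-based line-search idea concrete, the following is a minimal sketch of the simplest special case: a projected gradient step on the Stiefel manifold with Armijo backtracking. It is not the authors' TGP algorithm. The scaling matrices are taken to be the identity and the normal component is omitted, because the abstract does not give their exact form. The function names (`project_stiefel`, `riemannian_grad`, `projected_gradient_armijo`), the parameter defaults, and the quadratic test objective are all assumptions made for illustration.

```python
import numpy as np


def project_stiefel(Y):
    """Nearest matrix with orthonormal columns: the polar factor, via a thin SVD."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt


def riemannian_grad(X, G):
    """Project the Euclidean gradient G onto the tangent space of the Stiefel manifold at X."""
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2


def projected_gradient_armijo(f, egrad, X0, max_iter=500, tol=1e-8,
                              t0=1.0, beta=0.5, sigma=1e-4):
    """Illustrative ProjLS loop: step along -grad, project back, backtrack until Armijo holds."""
    X = X0
    for _ in range(max_iter):
        D = riemannian_grad(X, egrad(X))
        gnorm2 = np.sum(D * D)
        if np.sqrt(gnorm2) < tol:
            break
        fX = f(X)
        t = t0
        X_new = project_stiefel(X - t * D)
        # Armijo sufficient-decrease test; the floor on t guards against endless backtracking.
        while f(X_new) > fX - sigma * t * gnorm2 and t > 1e-12:
            t *= beta
            X_new = project_stiefel(X - t * D)
        X = X_new
    return X


if __name__ == "__main__":
    # Hypothetical test problem: minimize -0.5 * tr(X^T S X) over 8 x 2 orthonormal X.
    rng = np.random.default_rng(0)
    B = rng.standard_normal((8, 8))
    S = B @ B.T
    f = lambda X: -0.5 * np.trace(X.T @ S @ X)
    egrad = lambda X: -S @ X
    X0 = project_stiefel(rng.standard_normal((8, 2)))
    X = projected_gradient_armijo(f, egrad, X0)
    print("feasibility error:", np.linalg.norm(X.T @ X - np.eye(2)))
    print("Riemannian gradient norm:", np.linalg.norm(riemannian_grad(X, egrad(X))))
```

In this sketch the search direction is purely tangent. The TGP framework described above is broader: it also allows scaled gradients and a normal component in the direction before projecting.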

Paper Structure (10 sections, 12 equations)