A Theory of the NEPv Approach for Optimization On the Stiefel Manifold

Ren-Cang Li

TL;DR

The paper develops two complementary frameworks, NPDo and NEPv, for solving optimization problems on the Stiefel manifold ${\rm St}(k,n)$. Both transform the KKT condition into a structured nonlinear polar decomposition or eigenvalue problem, which is then solved by self-consistent-field (SCF) iterations. Central to both approaches is the notion of atomic functions: trace- or power-based building blocks whose properties ensure monotone ascent of the objective and global convergence to a stationary point under a unifying Ansatz. By proving that common matrix-trace objectives (and their convex compositions) fit into these atomic classes, the author establishes provable guarantees for a wide range of problems in machine learning and data analysis. The paper also discusses acceleration via LOCG and practical considerations, noting that NEPv typically requires weaker conditions and covers more problem classes than NPDo, though NPDo often offers implementation advantages. Altogether, the work provides a unified theory that broadens the applicability and reliability of SCF-based optimization on the Stiefel manifold.

Abstract

The NEPv approach has been increasingly used lately for optimization on the Stiefel manifold arising from machine learning. Generally speaking, the approach first turns the first-order optimality condition, also known as the KKT condition, into a nonlinear eigenvalue problem with eigenvector dependency (NEPv) or a nonlinear polar decomposition with orthogonal factor dependency (NPDo), and then solves the nonlinear problem via some variation of the self-consistent-field (SCF) iteration. The difficulty, however, lies in designing a proper SCF iteration so that a maximizer is found at the end. Currently, each use of the approach is very much individualized, especially in the convergence analysis that shows whether the approach works. In this paper, a unifying framework is established. The framework is built upon some basic assumptions. If the basic assumptions are satisfied, global convergence to a stationary point is guaranteed, and the objective function increases monotonically during the SCF iteration that leads to the stationary point. A notion of atomic functions is also proposed, which includes commonly used matrix traces of linear and quadratic forms as special cases. It is shown that the basic assumptions are satisfied by atomic functions and by convex compositions of atomic functions. Together they provide a large collection of objectives for which the NEPv/NPDo approach is guaranteed to work.
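To make the SCF idea in the abstract concrete, here is a minimal sketch for one simple objective, $f(P)=\operatorname{tr}(P^{\rm T}AP)+2\operatorname{tr}(P^{\rm T}D)$ with $A$ symmetric positive semidefinite. The objective, the function names, and the update rule (replace $P$ by the orthogonal polar factor of the gradient $\mathscr{H}(P)=2(AP+D)$) are assumptions chosen for illustration in the spirit of NPDo. They are not taken from the paper's algorithms. For this convex $f$, each polar step cannot decrease the objective, which the recorded history lets one inspect.

```python
import numpy as np

def polar_factor(M):
    """Orthogonal polar factor of an n-by-k matrix M, computed from its thin SVD."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

def f(P, A, D):
    """Illustrative objective: tr(P^T A P) + 2 tr(P^T D)."""
    return np.trace(P.T @ A @ P) + 2.0 * np.trace(P.T @ D)

def grad(P, A, D):
    """Gradient H(P) = df/dP, valid for symmetric A."""
    return 2.0 * (A @ P + D)

def scf_npdo(A, D, P0, tol=1e-10, max_iter=500):
    """Hypothetical SCF loop: P <- orthogonal polar factor of H(P)."""
    P = P0
    history = [f(P, A, D)]
    for _ in range(max_iter):
        P_new = polar_factor(grad(P, A, D))
        history.append(f(P_new, A, D))
        step = np.linalg.norm(P_new - P)
        P = P_new
        if step <= tol:
            break
    return P, history

rng = np.random.default_rng(0)
n, k = 8, 3
B = rng.standard_normal((n, n))
A = B @ B.T                      # symmetric PSD, so f is convex (assumption of this sketch)
D = rng.standard_normal((n, k))
P0 = polar_factor(rng.standard_normal((n, k)))  # random starting point on St(k, n)

P, history = scf_npdo(A, D, P0)
Lam = P.T @ grad(P, A, D)        # candidate multiplier Lambda = P^T H(P)
kkt_residual = np.linalg.norm(grad(P, A, D) - P @ Lam)
print("objective first/last:", history[0], history[-1])
print("KKT residual ||H(P) - P Lambda||:", kkt_residual)
```

The KKT residual is printed rather than asserted, since how quickly it shrinks depends on the data. Only the monotone increase of `history` is guaranteed by the convexity argument used here.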

Paper Structure (33 sections, 51 theorems, 271 equations, 6 tables, 4 algorithms)

Introduction
Review of the NEPv and NPDo Approach
Contributions
Organization and Notation
KKT Condition
The NPDo Approach
The NPDo Framework
The NPDo Ansatz
SCF Iteration and Convergence
Acceleration by LOCG and Convergence
Atomic Functions for NPDo
Conditions on Atomic Functions
Concrete Atomic Functions
Convex Composition
The NEPv Approach
...and 18 more sections

Key Result

Theorem 3.1

Let $P_*\in{\rm St}(k,n)$ be a maximizer of the main optimization problem (eq:main-opt). Suppose that the NPDo Ansatz holds and $P_*\in\mathbb{P}$. Then the KKT condition (eq:KKT) holds for $P=P_*$ and $\Lambda=\Lambda_*:=P_*^{\rm T}\mathscr{H}(P_*)\succeq 0$.
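The conclusion of the theorem can be checked numerically in one case where the maximizer is known in closed form. The sketch below assumes the linear objective $f(P)=\operatorname{tr}(P^{\rm T}D)$, whose maximizer over ${\rm St}(k,n)$ is the orthogonal polar factor of $D$ and whose gradient is $\mathscr{H}(P)=D$. It also assumes the KKT condition has the form $\mathscr{H}(P)=P\Lambda$ with $\Lambda$ symmetric, which is consistent with the definition of $\Lambda_*$ above but is not quoted from the source. The hypotheses of the NPDo Ansatz are not verified here, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 7, 3
D = rng.standard_normal((n, k))

# Closed-form maximizer of tr(P^T D) on St(k, n): the orthogonal polar factor of D.
U, s, Vt = np.linalg.svd(D, full_matrices=False)
P_star = U @ Vt

H_star = D                       # gradient of tr(P^T D) with respect to P (constant)
Lam_star = P_star.T @ H_star     # equals V diag(s) V^T

# Quantities named in the theorem's conclusion.
sym_err = np.linalg.norm(Lam_star - Lam_star.T)                  # symmetry of Lambda_*
min_eig = np.linalg.eigvalsh((Lam_star + Lam_star.T) / 2).min()  # positive semidefiniteness
kkt_err = np.linalg.norm(H_star - P_star @ Lam_star)             # H(P_*) = P_* Lambda_*

# Sanity check of maximality against random feasible points (QR gives orthonormal columns).
f_star = np.trace(P_star.T @ D)
f_samples = [np.trace(np.linalg.qr(rng.standard_normal((n, k)))[0].T @ D)
             for _ in range(200)]

print("symmetry error:", sym_err)
print("smallest eigenvalue of Lambda_*:", min_eig)
print("KKT residual:", kkt_err)
print("f(P_*) >= best sampled value:", f_star >= max(f_samples))
```

Here $\Lambda_*=V\,{\rm diag}(s)\,V^{\rm T}$ is positive semidefinite because the singular values $s$ are nonnegative, and $P_*\Lambda_*=U\,{\rm diag}(s)\,V^{\rm T}=D$ reproduces the assumed KKT equation exactly.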

Theorems & Definitions (115)

Definition 2.1
Remark 2.1
Example 3.1
Remark 3.1
Theorem 3.1
proof
Theorem 3.2
proof
Corollary 3.1
Theorem 3.3
...and 105 more
