Local Linear Convergence of Infeasible Optimization with Orthogonal Constraints

Youbang Sun; Shixiang Chen; Alfredo Garcia; Shahin Shahrampour

TL;DR

The paper addresses optimization with orthogonality constraints on the Stiefel manifold and the computational burden of retractions. It analyzes the infeasible, retraction-free landing algorithm and proves a local linear convergence rate under a local Riemannian Polyak-Łojasiewicz condition, facilitated by a carefully constructed merit function. The authors validate the theory with high-dimensional PCA and CNN training experiments, showing comparable convergence to retraction-based methods while substantially reducing per-iteration cost. This work demonstrates the practicality of retraction-free optimization for large-scale problems that involve orthogonality constraints.

Abstract

Many classical and modern machine learning algorithms require solving optimization tasks under orthogonality constraints. Solving these tasks with feasible methods requires a gradient descent update followed by a retraction operation on the Stiefel manifold, which can be computationally expensive. Recently, an infeasible retraction-free approach, termed the landing algorithm, was proposed as an efficient alternative. Motivated by the common occurrence of orthogonality constraints in tasks such as principal component analysis and training of deep neural networks, this paper studies the landing algorithm and establishes a novel linear convergence rate for smooth non-convex functions using only a local Riemannian PŁ condition. Numerical experiments demonstrate that the landing algorithm performs on par with state-of-the-art retraction-based methods while substantially reducing computational overhead.
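To make the contrast with retraction-based updates concrete, here is a minimal NumPy sketch of a landing-style iteration on a small PCA problem. It is an illustration under stated assumptions, not the authors' implementation. The update form (a skew-symmetric "relative gradient" term plus the gradient of the penalty (1/4)||X^T X - I||_F^2), the function name `landing_step`, the step size, the penalty weight `lam`, and the synthetic matrix `A` are all choices made for this example and do not appear in the text above.

```python
import numpy as np

def landing_step(X, euclid_grad, step, lam):
    """One retraction-free update for X of shape (n, p), aiming at X.T @ X = I.

    Assumed update (illustrative): X <- X - step * (S X + lam * X (X^T X - I)),
    where S = 0.5 * (G X^T - X G^T) is skew-symmetric and G is the Euclidean gradient.
    """
    G = euclid_grad(X)
    p = X.shape[1]
    gram = X.T @ X
    # S @ X computed without forming the n x n matrix S.
    skew_term = 0.5 * (G @ gram - X @ (G.T @ X))
    # Gradient of the penalty (1/4) * ||X^T X - I||_F^2.
    penalty_term = X @ (gram - np.eye(p))
    return X - step * (skew_term + lam * penalty_term)

# Synthetic PCA instance: maximize trace(X^T A X) over orthonormal X.
rng = np.random.default_rng(0)
n, p = 20, 2
eigvals = np.concatenate([[5.0, 4.0], np.linspace(1.0, 0.1, n - 2)])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(eigvals) @ Q.T
A = 0.5 * (A + A.T)  # remove floating-point asymmetry

def pca_grad(X):
    # Euclidean gradient of f(X) = -0.5 * trace(X^T A X).
    return -A @ X

# Start from a random orthonormal matrix; iterates are then allowed to leave the manifold.
X, _ = np.linalg.qr(rng.standard_normal((n, p)))
for _ in range(2000):
    X = landing_step(X, pca_grad, step=0.02, lam=1.0)

feasibility_gap = np.linalg.norm(X.T @ X - np.eye(p))
explained = np.trace(X.T @ A @ X)
print("||X^T X - I||_F =", feasibility_gap)
print("trace(X^T A X)  =", explained, "(top-2 eigenvalue sum is 9.0 by construction)")
```

No QR, SVD, or matrix exponential is applied inside the loop. Each step uses only matrix products, which is where the reduced per-iteration cost comes from. Orthogonality is enforced by the penalty term only in the limit rather than at every iterate.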
