ADMM for Nonsmooth Composite Optimization under Orthogonality Constraints

Ganzhao Yuan

TL;DR

This work proposes OADMM, an Alternating Direction Method of Multipliers (ADMM) designed to solve a class of structured, nonconvex, nonsmooth optimization problems under orthogonality constraints, where the objective combines a smooth function, a nonsmooth concave function, and a nonsmooth weakly convex function.

Abstract

We consider a class of structured, nonconvex, nonsmooth optimization problems under orthogonality constraints, where the objectives combine a smooth function, a nonsmooth concave function, and a nonsmooth weakly convex function. This class of problems finds diverse applications in statistical learning and data science. Existing methods for addressing these problems often fail to exploit the specific structure of orthogonality constraints, struggle with nonsmooth functions, or result in suboptimal oracle complexity. We propose OADMM, an Alternating Direction Method of Multipliers (ADMM) designed to solve this class of problems using efficient proximal linearized strategies. Two specific variants of OADMM are explored: one based on Euclidean Projection (OADMM-EP) and the other on Riemannian Retraction (OADMM-RR). Under mild assumptions, we prove that OADMM converges to a critical point of the problem with an ergodic convergence rate of $\mathcal{O}(1/\epsilon^{3})$. Additionally, we establish a polynomial convergence rate or super-exponential convergence rate for OADMM, depending on the specific setting, under the Kurdyka-Łojasiewicz (KL) inequality. To the best of our knowledge, this is the first non-ergodic convergence result for this class of nonconvex nonsmooth optimization problems. Numerical experiments demonstrate that the proposed algorithm achieves state-of-the-art performance.

Keywords: Orthogonality Constraints; Nonconvex Optimization; Nonsmooth Composite Optimization; ADMM; Convergence Analysis
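The two variants named in the abstract differ in how an iterate is brought back onto the set of matrices with orthonormal columns. The sketch below illustrates the two generic building blocks only, not the paper's actual OADMM updates: a Euclidean projection computed from a thin SVD, and a QR-based retraction applied to a tangent-space step. The function names, the choice of QR as the retraction, and the step size are assumptions made for illustration.

```python
import numpy as np

def project_stiefel(Y):
    """Euclidean projection: nearest matrix with orthonormal columns
    (in Frobenius norm), obtained from the thin SVD Y = U S V^T as U V^T."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt

def retract_qr(X, G, step):
    """Retraction-style update (illustrative): project the direction G onto
    the tangent space at X, take a step, then re-orthonormalize with QR."""
    sym = (X.T @ G + G.T @ X) / 2.0
    xi = G - X @ sym                      # tangent-space component of G at X
    Q, R = np.linalg.qr(X - step * xi)
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs                      # flip column signs so diag(R) > 0

rng = np.random.default_rng(0)
X = project_stiefel(rng.standard_normal((6, 3)))   # feasible starting point
G = rng.standard_normal((6, 3))                    # stand-in for a gradient
X_next = retract_qr(X, G, step=0.1)
```

Both routines return a matrix satisfying $\mathbf{X}^\top\mathbf{X} = \mathbf{I}$ up to floating-point error; the projection can be applied to any full-rank matrix, while the retraction starts from a feasible point and moves along a tangent direction.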

Paper Structure

This paper contains 41 sections, 44 theorems, 132 equations, 4 figures, 1 table, 1 algorithm.

Key Result

Lemma 2.2

(BohmW21) Let $h: \mathbb{R}^{m} \to \mathbb{R}$ be a proper, $W_h$-weakly convex, and lower semicontinuous function. Assume $\mu\in(0,W_h^{-1})$. Then the following results hold. The function $h_{\mu}(\cdot)$ is continuously differentiable with gradient $\nabla h_{\mu}(\mathbf{y}) = \frac{1}{\
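As a small numerical illustration of the smoothing idea behind this lemma, the sketch below assumes the standard Moreau envelope definition $h_{\mu}(\mathbf{y}) = \min_{\mathbf{x}} h(\mathbf{x}) + \frac{1}{2\mu}\|\mathbf{x}-\mathbf{y}\|_2^2$ and the standard identity that its gradient equals $(\mathbf{y} - \operatorname{prox}_{\mu h}(\mathbf{y}))/\mu$. It uses $h = \|\cdot\|_1$, which is convex and therefore weakly convex, so any $\mu > 0$ is admissible here. The choice of $h$ and the function names are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def prox_l1(y, mu):
    """Proximal operator of mu * ||.||_1 (soft-thresholding)."""
    return np.sign(y) * np.maximum(np.abs(y) - mu, 0.0)

def moreau_env_l1(y, mu):
    """Moreau envelope of ||.||_1 evaluated at y, using the prox minimizer."""
    x = prox_l1(y, mu)
    return np.sum(np.abs(x)) + np.sum((x - y) ** 2) / (2.0 * mu)

def moreau_grad_l1(y, mu):
    """Gradient of the envelope via the identity (y - prox(y)) / mu."""
    return (y - prox_l1(y, mu)) / mu

y = np.array([2.0, -0.3, 0.05])
mu = 0.5
grad = moreau_grad_l1(y, mu)

# Central finite differences as an independent check of the gradient formula.
eps = 1e-6
fd = np.array([
    (moreau_env_l1(y + eps * e, mu) - moreau_env_l1(y - eps * e, mu)) / (2 * eps)
    for e in np.eye(y.size)
])
```

For this choice of $h$ the envelope is the Huber function: the gradient is clipped to $[-1, 1]$ in each coordinate, which shows how the envelope replaces the kink of $|\cdot|$ at zero with a smooth quadratic piece.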

Figures (4)

  • Figure 1: The convergence curve of the compared methods with $\dot{\rho}=50$.
  • Figure 2: The convergence curve of the compared methods with $\dot{\rho}=500$.
  • Figure 3: The convergence curve of the compared methods with $\dot{\rho}=10$.
  • Figure 5: The convergence curve of the compared methods with $\dot{\rho}=1000$.

Theorems & Definitions (99)

  • Definition 2.1
  • Lemma 2.2
  • Lemma 2.3
  • Lemma 2.4
  • Lemma 2.5
  • Remark 2.6
  • Definition 2.7
  • Definition 2.8
  • Remark 2.9
  • Lemma 2.10
  • ...and 89 more