Tight Error Bounds for the Sign-Constrained Stiefel Manifold

Xiaojun Chen, Yifan He, Zaikun Zhang

TL;DR

This work derives explicit global and local error bounds that relate the distance to sign-constrained Stiefel manifolds $\mathbb{S}^{n,r}_{S}$ to computable residuals, with constants $\nu$ and exponents $\tfrac{1}{2}$ or $1$, and shows these bounds are tight in the regime $1 < r < n$. It extends the nonnegative Stiefel results to general sign constraints, including special-case and general-case bounds, and establishes linear regularity characterizations. Leveraging these bounds, the authors develop exact penalty formulations for minimizing Lipschitz continuous objectives under orthogonality and sign constraints, detailing precise thresholds for penalty exponents and parameters. They validate the theory with synthetic experiments and Yale face data, demonstrating improved reconstruction quality when using the sign-constrained penalties. Overall, the results provide a rigorous, dimension-free toolkit for penalty methods and error analysis in constrained Stiefel-manifold optimization with signs.

Abstract

The sign-constrained Stiefel manifold in $\mathbb{R}^{n\times r}$ is a segment of the Stiefel manifold with fixed signs (nonnegative or nonpositive) for some columns of the matrices. It includes the nonnegative Stiefel manifold as a special case. We present global and local error bounds that provide an inequality with easily computable residual functions and explicit coefficients to bound the distance from matrices in $\mathbb{R}^{n\times r}$ to the sign-constrained Stiefel manifold. Moreover, we show that the error bounds cannot be improved except for the multiplicative constants under some mild conditions, which explains why two square-root terms are necessary in the bounds when $1< r <n$ and why the $\ell_1$ norm can be used in the bounds when $r = n$ or $r = 1$ for the sign constraints and orthogonality, respectively. The error bounds are applied to derive exact penalty methods for minimizing a Lipschitz continuous function with orthogonality and sign constraints.
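To make the "easily computable residual functions" concrete, the following is a minimal sketch of how an orthogonality residual and a sign residual could be evaluated with NumPy. The function name `residuals`, the choice of the Frobenius norm, and the index lists `nonneg_cols` and `nonpos_cols` are assumptions made for illustration; the paper's exact residual definitions, constants, and exponents are not reproduced here.

```python
import numpy as np

def residuals(X, nonneg_cols, nonpos_cols):
    """Return (orthogonality residual, sign residual), both in the Frobenius norm.

    Hypothetical helper: nonneg_cols / nonpos_cols list the columns that are
    required to be entrywise >= 0 / <= 0; all other columns are unconstrained.
    """
    X = np.asarray(X, dtype=float)
    r = X.shape[1]
    # Orthogonality residual: how far X^T X is from the identity.
    orth = np.linalg.norm(X.T @ X - np.eye(r), "fro")
    # Sign residual: size of the entries that violate the prescribed signs.
    neg_part = np.minimum(X[:, nonneg_cols], 0.0)  # negative entries where >= 0 is required
    pos_part = np.maximum(X[:, nonpos_cols], 0.0)  # positive entries where <= 0 is required
    sign = np.sqrt(np.linalg.norm(neg_part) ** 2 + np.linalg.norm(pos_part) ** 2)
    return orth, sign

# A feasible point: orthonormal columns, column 0 nonnegative, column 1 nonpositive.
X_feas = np.eye(4)[:, :3] * np.array([1.0, -1.0, 1.0])
orth_feas, sign_feas = residuals(X_feas, [0], [1])

# Perturb one entry so that both residuals become nonzero.
X_pert = X_feas.copy()
X_pert[1, 0] = -0.1
orth_pert, sign_pert = residuals(X_pert, [0], [1])
```

Both residuals vanish at the feasible point and become positive after the perturbation. An error bound of the kind described in the abstract controls the distance to the manifold by a function of such quantities.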

Paper Structure (18 sections, 28 theorems, 139 equations, 1 figure, 2 tables)

Key Result

Lemma 1

For any matrices $X \in \mathbb{R}^{n\times r}$ and $Y\in\mathbb{R}^{n\times r}$, we have $\|\Sigma(X) - \Sigma(Y)\| \le \|X - Y\|$ for any unitarily invariant norm $\|\cdot\|$ on $\mathbb{R}^{n\times r}$. When $\|\cdot\|$ is the Frobenius norm, equality holds in this inequality if and only if there exist orthogonal matrices $U\in\mathbb{R}^{n\times n}$ and $V \in \mathbb{R}^{r \times r}$ such that $X = U\Sigma(X) V^{\mathsf{T}}$ ...
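As a numerical sanity check of Mirsky's inequality, the sketch below compares singular-value differences with the norm of $X - Y$ for three unitarily invariant norms (Frobenius, spectral, nuclear). The helper name `mirsky_gap` and the random test matrices are illustrative assumptions, not part of the paper.

```python
import numpy as np

def mirsky_gap(X, Y):
    """Return ||X - Y|| - ||sigma(X) - sigma(Y)|| for three unitarily invariant norms.

    Each gap should be nonnegative (up to rounding) by Mirsky's inequality.
    """
    sx = np.linalg.svd(X, compute_uv=False)      # singular values of X, descending
    sy = np.linalg.svd(Y, compute_uv=False)      # singular values of Y, descending
    sd = np.linalg.svd(X - Y, compute_uv=False)  # singular values of X - Y
    d = np.abs(sx - sy)
    return {
        "frobenius": np.linalg.norm(sd) - np.linalg.norm(d),
        "spectral": sd.max() - d.max(),
        "nuclear": sd.sum() - d.sum(),
    }

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))
Y = rng.standard_normal((6, 3))
gaps = mirsky_gap(X, Y)

# Y = 2X shares singular vectors with X, so all three gaps should be (numerically) zero.
gaps_equal = mirsky_gap(X, 2.0 * X)
```

The second call illustrates the equality case in a simple setting: when the two matrices admit a common singular value decomposition, the inequality is attained.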

Figures (1)

  • Figure 1: Row $i$ ($i=1,\ldots, 5$) shows average values of RRE and PEV of the reconstructed matrix $\hat{A}$ using 11 images for the $i$th person by models (\ref{trace}) with $\lambda=1$ and (\ref{traceP}) with $\lambda=1, \mu=6$, respectively, for $r = 1, \ldots, 32$.

Theorems & Definitions (51)

  • Lemma 1: Mirsky
  • Lemma 2: von Neumann
  • Lemma 3: Fan-Hoffman
  • Lemma 4
  • proof
  • Lemma 5
  • Lemma 6
  • Theorem 1
  • proof
  • Proposition 1
  • ...and 41 more