Accuracy and Stability of CUR decompositions with Oversampling

Taejun Park, Yuji Nakatsukasa

TL;DR

This paper analyzes the CUR decomposition with oversampling, focusing on CURCA, which uses the core $U = A(I,J)$, and on its stabilized variant SCURCA, which replaces the pseudoinverse of the core with an $\epsilon$-pseudoinverse. It proves relative error bounds for CURCA and SCURCA and shows that both can be computed in a backward-stable way under rounding errors, provided the row and column indices are chosen so that they approximate the dominant subspaces well; oversampling improves the conditioning. A deterministic CS-decomposition–inspired oversampling algorithm is proposed to increase the minimum singular value of the row-submatrix, balancing accuracy and stability. Numerical experiments show that oversampling one side (rows or columns) and using a stable CURCA implementation yield reliable, accurate low-rank approximations, while independent selection of rows and columns can be unstable unless oversampling is employed. Overall, the work provides practical guidelines for stable CUR-based approximations with oversampling in large-scale settings.
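
For orientation, the two approximants named above can be written out as follows. This is a sketch in standard cross-approximation notation: the symbols $C$, $R$, the singular value decomposition $U = \sum_i \sigma_i w_i v_i^T$, and the truncation rule used for the $\epsilon$-pseudoinverse are assumptions made here for illustration, and the paper's exact definitions may differ in detail.

$$
\text{CURCA: } A \approx C\,U^{\dagger} R, \qquad \text{SCURCA: } A \approx C\,U_{\epsilon}^{\dagger} R, \qquad C = A(:,J),\quad U = A(I,J),\quad R = A(I,:),
$$

$$
U_{\epsilon}^{\dagger} = \sum_{\sigma_i > \epsilon} \sigma_i^{-1}\, v_i w_i^T .
$$

Under this convention, discarding the singular values of $U$ at or below $\epsilon$ keeps $\lVert U_{\epsilon}^{\dagger}\rVert_2$ below $1/\epsilon$. With oversampling, $|I| \neq |J|$ and $U$ is rectangular, so a pseudoinverse takes the place of the matrix inverse.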

Abstract

This work investigates the accuracy and numerical stability of CUR decompositions with oversampling. The CUR decomposition approximates a matrix using a subset of its columns and rows. When the numbers of columns and rows are the same, the CUR decomposition can become unstable and less accurate due to the presence of the matrix inverse in the core matrix. Nevertheless, we demonstrate that the CUR decomposition can be implemented in a numerically stable manner and illustrate that oversampling, which increases either the number of columns or the number of rows in the CUR decomposition, can enhance its accuracy and stability. Additionally, this work devises an algorithm for oversampling motivated by the theory of the CUR decomposition and the cosine-sine decomposition, whose competitiveness is illustrated through experiments.
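
The effect of row oversampling described in the abstract can be tried out with a small NumPy sketch. Everything in it is an illustrative assumption rather than the paper's method: the helper names (`greedy_pivots`, `eps_pinv`, `cur_cross`), the synthetic test matrix, and in particular the leverage-score rule used to pick the extra rows, which stands in for the CS-decomposition-based algorithm the paper proposes. The sketch selects $k$ columns, derives $k$ rows from them, adds $p$ more rows, and compares the smallest singular value of the row-submatrix of the column basis before and after oversampling.

```python
import numpy as np

def greedy_pivots(M, k):
    """Pick k column indices of M by greedy pivoting (largest residual norm first).
    Assumes k does not exceed the numerical rank of M."""
    R = np.array(M, dtype=float, copy=True)
    picked = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))
        picked.append(j)
        q = R[:, j] / norms[j]
        R -= np.outer(q, q @ R)  # remove the chosen direction from every column
    return np.array(picked)

def eps_pinv(U, eps):
    """Pseudoinverse of U that discards singular values <= eps (assumed convention)."""
    W, s, Vt = np.linalg.svd(U, full_matrices=False)
    keep = s > eps
    return (Vt[keep].T / s[keep]) @ W[:, keep].T

def cur_cross(A, I, J, eps=0.0):
    """Cross approximation A[:, J] @ pinv_eps(A[I, J]) @ A[I, :]."""
    C, R, U = A[:, J], A[I, :], A[np.ix_(I, J)]
    return C @ (eps_pinv(U, eps) @ R)

# Synthetic test matrix with geometrically decaying singular values (hypothetical data).
rng = np.random.default_rng(0)
m, n, k, p = 300, 200, 10, 10
X, _ = np.linalg.qr(rng.standard_normal((m, n)))
Y, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (X * 0.5 ** np.arange(n)) @ Y.T

J = greedy_pivots(A, k)        # k column indices
Q, _ = np.linalg.qr(A[:, J])   # orthonormal basis for the selected columns
I = greedy_pivots(Q.T, k)      # k row indices computed from the selected columns

# Oversample rows: add the p unselected rows with the largest leverage scores.
lev = np.sum(Q**2, axis=1)
lev[I] = -np.inf
I_over = np.concatenate([I, np.argsort(lev)[-p:]])

# Smallest singular value of the row-submatrix of Q, before and after oversampling.
smin = np.linalg.svd(Q[I], compute_uv=False)[-1]
smin_over = np.linalg.svd(Q[I_over], compute_uv=False)[-1]

def rel_err(B):
    return np.linalg.norm(A - B) / np.linalg.norm(A)

err = rel_err(cur_cross(A, I, J))
err_over = rel_err(cur_cross(A, I_over, J))
err_stab = rel_err(cur_cross(A, I_over, J, eps=1e-12))
print(f"sigma_min: {smin:.3e} -> {smin_over:.3e}")
print(f"relative errors: {err:.3e} (square core), {err_over:.3e} (oversampled), "
      f"{err_stab:.3e} (oversampled, eps-truncated)")
```

Because `I_over` contains `I`, the smallest singular value of `Q[I_over]` cannot be smaller than that of `Q[I]`, which is the conditioning improvement that oversampling is meant to deliver. The product is formed as `C @ (pinv @ R)` purely for brevity; the source does not specify here which ordering of the operations is the stable one it recommends.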

Paper Structure

This paper contains 19 sections, 12 theorems, 85 equations, 4 figures, and 2 algorithms.

Key Result

Lemma 3.1

Let $\mathcal{P}_{X,Y} \in \mathbb{R}^{n\times n}$ be a projector where $X\in \mathbb{R}^{n\times k}$, $Y\in \mathbb{R}^{n\times \ell}$ and $Y^TX\in \mathbb{R}^{\ell\times k}$ all have full column rank (so $k\leq \ell$). Then for any unitarily invariant norm $\left\lVert\cdot\right\rVert$ where $Q_X$ is a matrix with orthonormal columns spanning the column space of $X$.

Figures (4)

  • Figure 1: Tests for different implementations of the CURCA. The third and fourth implementations perform stably. We recommend the third implementation.
  • Figure 1: The effect of oversampling for the CURBA. We use pivoting on a random sketch (Algorithm \ref{alg:pivsketch}) to obtain the initial set of indices. Then we oversample the row indices using the three oversampling algorithms, $\mathtt{OS\!+\!P}$, $\mathtt{OS\!+\!L}$ and $\mathtt{OS\!+\!E}$. We also oversample column indices in Figure \ref{subfig:YaleBA}.
  • Figure 2: Relationship between the rows and columns in the CURCA. In Case $1$, the rows and columns are chosen independently of one another; in Case $2$, the columns are selected first and the rows are then computed from the selected columns. Cases $3$ and $4$ correspond to Cases $1$ and $2$, respectively, with row oversampling ($p = k$). When the rows and columns are chosen independently (Case $1$), the resulting approximation can be catastrophically inaccurate.
  • Figure 3: Comparison of different oversampling methods for various test matrices. We use pivoting on a random sketch to obtain the initial set of indices, except in Figure \ref{subfig:CIFARCAuni}, where we use uniform sampling. Then we oversample the row indices using the three oversampling algorithms, $\mathtt{OS\!+\!P}$, $\mathtt{OS\!+\!L}$ and $\mathtt{OS\!+\!E}$. We also oversample column indices in Figure \ref{subfig:YaleCA} to demonstrate that oversampling both row and column indices can be harmful.

Theorems & Definitions (23)

  • Lemma 3.1
  • Proof 1
  • Theorem 3.3
  • Proof 2
  • Corollary 3.4
  • Proof 3
  • Remark 3.5
  • Lemma 3.6
  • Lemma 3.7
  • Theorem 3.8
  • ...and 13 more