Accuracy and Stability of CUR decompositions with Oversampling
Taejun Park, Yuji Nakatsukasa
TL;DR
This paper analyzes the CUR decomposition with oversampling, focusing on CURCA, which uses the core $U = A(I,J)$, and on its stabilized variant SCURCA, obtained via an $\epsilon$-pseudoinverse. It proves relative error bounds for CURCA and SCURCA, and shows that the computation is backward stable under rounding errors when the row and column indices are chosen so that they approximate the dominant subspaces well, with oversampling improving conditioning. A deterministic oversampling algorithm inspired by the CS decomposition is proposed to increase the minimum singular value of the row-submatrix, balancing accuracy and stability. Numerical experiments show that oversampling one side (rows or columns) and using a stable CURCA implementation yield reliable, accurate low-rank approximations, while independent selection of rows and columns can be unstable unless oversampling is employed. Overall, the work provides practical guidelines for stable CUR-based approximations with oversampling in large-scale settings.
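To make the objects named above concrete, the following is a minimal NumPy sketch of a CUR approximation built from the core $U = A(I,J)$, with an optional $\epsilon$-truncated pseudoinverse. It is an illustration under stated assumptions, not the authors' implementation: the function names `eps_pinv` and `curca`, the convention of discarding singular values that are at most `eps`, and the order in which the three factors are multiplied are all choices made for this sketch.

```python
import numpy as np


def eps_pinv(U, eps):
    """Pseudoinverse of U after discarding singular values <= eps.

    Assumption: this truncation rule is one plausible reading of an
    epsilon-pseudoinverse; the paper's exact definition may differ.
    """
    W, s, Vt = np.linalg.svd(U, full_matrices=False)
    keep = s > eps
    # Invert only the retained singular values.
    return (Vt[keep].T / s[keep]) @ W[:, keep].T


def curca(A, I, J, eps=0.0):
    """CUR approximation C @ pinv(U) @ R with the core U = A[I, J].

    eps = 0.0 keeps every nonzero singular value of the core;
    eps > 0 gives the truncated, stabilized variant of this sketch.
    """
    C = A[:, J]            # selected columns
    R = A[I, :]            # selected rows
    U = A[np.ix_(I, J)]    # intersection of the selected rows and columns
    return C @ (eps_pinv(U, eps) @ R)


# Illustrative data (hypothetical): an exactly rank-5 matrix.
rng = np.random.default_rng(0)
m, n, r = 50, 40, 5
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
I = rng.choice(m, size=r, replace=False)
J = rng.choice(n, size=r, replace=False)

for eps in (0.0, 1e-10):
    err = np.linalg.norm(A - curca(A, I, J, eps)) / np.linalg.norm(A)
    print(f"eps = {eps:g}: relative error = {err:.2e}")
```

For an exactly rank-$r$ matrix and index sets of size $r$ with a nonsingular core, the approximation reproduces $A$ up to rounding; how large the rounding effect is depends on the conditioning of the core, which is the quantity oversampling aims to improve.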
Abstract
This work investigates the accuracy and numerical stability of CUR decompositions with oversampling. The CUR decomposition approximates a matrix using a subset of its columns and rows. When the numbers of selected columns and rows are equal, the CUR decomposition can become unstable and less accurate due to the presence of the matrix inverse in the core matrix. Nevertheless, we demonstrate that the CUR decomposition can be implemented in a numerically stable manner and illustrate that oversampling, which increases either the number of columns or rows in the CUR decomposition, can enhance its accuracy and stability. Additionally, this work devises an algorithm for oversampling motivated by the theory of the CUR decomposition and the cosine-sine decomposition, whose competitiveness is illustrated through experiments.
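The effect of oversampling on conditioning can be illustrated with a small experiment. For a matrix $Q$ with orthonormal columns, appending rows to a row-submatrix $Q(I,:)$ can never decrease its smallest singular value, so a larger index set is at least as well conditioned. The sketch below uses a simple greedy heuristic to pick the extra rows; this heuristic, the name `greedy_oversample`, and the random test data are assumptions made for illustration, and it is not the cosine-sine-decomposition-based algorithm devised in this work.

```python
import numpy as np


def sigma_min(M):
    """Smallest singular value of M."""
    return np.linalg.svd(M, compute_uv=False)[-1]


def greedy_oversample(Q, I, p):
    """Hypothetical heuristic: append p row indices to I, each chosen to
    maximize the smallest singular value of the enlarged submatrix Q[I, :].
    """
    I = list(I)
    for _ in range(p):
        candidates = [i for i in range(Q.shape[0]) if i not in I]
        best = max(candidates, key=lambda i: sigma_min(Q[I + [i], :]))
        I.append(best)
    return I


# Illustrative data (hypothetical): an orthonormal basis of a random
# 5-dimensional subspace of R^60.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((60, 5)))

I0 = list(range(5))               # square selection: as many rows as columns
I1 = greedy_oversample(Q, I0, 5)  # oversampled selection: 5 extra rows

print("sigma_min without oversampling:", sigma_min(Q[I0, :]))
print("sigma_min with oversampling:   ", sigma_min(Q[I1, :]))
```

The greedy search recomputes an SVD for every candidate row, which is only practical for small examples; it is meant to show the monotone behavior of the smallest singular value, not to be an efficient selection rule.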
