New perturbation bounds for low rank approximation of matrices via contour analysis
Phuc Tran, Van Vu
TL;DR
This paper develops a contour-analysis framework for bounding the perturbation of the best rank-$p$ approximation under noise. It introduces skewness parameters $x$ and $y$ that quantify how the noise interacts with the singular vectors of the ground matrix. The main deterministic result is the bound $\|\tilde{A}_p-A_p\| \le 32\,\sigma_p\left(\frac{\|E\|}{\sigma_p}+\frac{rx}{\delta_p}+\frac{r^2y}{\sigma_p\delta_p}\right)$, valid under a mild coupling condition; a stronger variant uses a halved-rank notion $r(p)$ to weaken the low-rank assumption. The framework yields sharper, dimension-aware bounds than the classical Eckart-Young-Mirsky (EY) and Davis-Kahan (DK) approaches, especially when the noise aligns weakly with the top singular vectors, and it extends to the symmetric case with a bound involving $\delta_S$. An application to low-rank recovery with missing and noisy entries demonstrates practical relevance. The appendices extend the results to broader random-noise settings and motivate the approach with classical models such as matrix recovery, the spiked model, the stochastic block model (SBM), and Gaussian mixtures.
Abstract
Let $A$ be an $m \times n$ matrix with rank $r$ and singular value decomposition $A = \sum_{i=1}^r \sigma_i u_i v_i^\top$, where $\sigma_i$ are its singular values, in decreasing order, and $u_i, v_i$ are the corresponding left and right singular vectors. For a parameter $1 \le p \le r$, $A_p := \sum_{i=1}^p \sigma_i u_i v_i^\top$ is the best rank-$p$ approximation of $A$. In practice, one often chooses $p$ to be small, leading to the commonly used phrase "low-rank approximation". Low-rank approximation plays a central role in data science because it can substantially reduce the dimensionality of the original data, the matrix $A$. For a large data matrix $A$, one typically computes a rank-$p$ approximation $A_p$ for a suitably chosen small $p$, stores $A_p$, and uses it as input for further computations. The reduced dimension of $A_p$ enables faster computations and significant data compression. In practice, noise is inevitable. We often have access only to noisy data $\tilde A = A + E$, where $E$ represents the noise. Consequently, the low-rank approximation used as input in many downstream tasks is $\tilde A_p$, the best rank-$p$ approximation of $\tilde A$, rather than $A_p$. Therefore, it is natural and important to estimate the error $\| \tilde A_p - A_p \|$. In this paper, we develop a novel method (based on contour analysis) to bound $\| \tilde A_p - A_p \|$. We introduce new parameters that measure the skewness between the noise matrix $E$ and the singular vectors of $A$, and exploit these to obtain notable improvements, in many settings, over classical approaches in the literature (which use the Eckart-Young-Mirsky theorem or the Davis-Kahan theorem). This method is of independent interest and has many further applications.
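To make the quantity $\| \tilde A_p - A_p \|$ concrete, the following is a minimal numerical sketch in Python with NumPy. It is not the paper's method: it only computes $A_p$ and $\tilde A_p$ by truncated SVD and measures the error in spectral norm. The helper names (`best_rank_p`, `low_rank_perturbation_error`), the matrix sizes, the singular values, and the Gaussian noise are all illustrative assumptions. For reference it also evaluates the elementary triangle-inequality bound $2(\sigma_{p+1} + \|E\|)$, which follows from the Eckart-Young-Mirsky theorem and Weyl's inequality and is not one of the new bounds of the paper.

```python
import numpy as np


def best_rank_p(M, p):
    """Best rank-p approximation of M, obtained by truncating its SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Keep the p largest singular triples (NumPy returns s in decreasing order).
    return (U[:, :p] * s[:p]) @ Vt[:p, :]


def low_rank_perturbation_error(A, E, p):
    """Spectral norm of tilde(A)_p - A_p, where tilde(A) = A + E."""
    A_p = best_rank_p(A, p)
    A_tilde_p = best_rank_p(A + E, p)
    return np.linalg.norm(A_tilde_p - A_p, ord=2)


# Hypothetical test instance: all sizes and values below are assumptions.
rng = np.random.default_rng(0)
m, n, r, p = 60, 40, 5, 2

# Ground matrix A of rank r with prescribed singular values.
U, _ = np.linalg.qr(rng.standard_normal((m, r)))  # orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((n, r)))  # orthonormal columns
sigma = np.array([50.0, 40.0, 10.0, 5.0, 1.0])
A = (U * sigma) @ V.T

# Illustrative Gaussian noise matrix E.
E = 0.1 * rng.standard_normal((m, n))

err = low_rank_perturbation_error(A, E, p)
noise_norm = np.linalg.norm(E, ord=2)

# Elementary bound: ||tilde(A)_p - A_p|| <= 2 * (sigma_{p+1} + ||E||).
# sigma[p] is sigma_{p+1} because of 0-based indexing.
classical_bound = 2.0 * (sigma[p] + noise_norm)

print(f"||E||                 = {noise_norm:.4f}")
print(f"||A~_p - A_p||        = {err:.4f}")
print(f"2(sigma_(p+1) + ||E||) = {classical_bound:.4f}")
```

Varying `sigma`, `p`, and the scale of `E` in this sketch shows how the observed error moves relative to $\|E\|$ and to the elementary bound; it says nothing by itself about the skewness parameters introduced in the paper.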
