New perturbation bounds for low rank approximation of matrices via contour analysis

Phuc Tran; Van Vu

TL;DR

This paper develops a contour-analysis framework to bound the perturbation of the best rank-$p$ approximation under noise, introducing skewness parameters $x$ and $y$ that quantify how the noise interacts with the singular vectors of the ground matrix. The main deterministic result gives the bound $\|\tilde{A}_p-A_p\| \le 32\,\sigma_p\left(\frac{\|E\|}{\sigma_p}+\frac{rx}{\delta_p}+\frac{r^2y}{\sigma_p\delta_p}\right)$ under a mild coupling condition, together with a stronger variant that uses a halved-rank notion $r(p)$ to weaken the low-rank assumption. The framework yields sharper, dimension-aware bounds than the classical Eckart-Young-Mirsky (EY) and Davis-Kahan (DK) results, especially when the noise aligns weakly with the top singular vectors, and it extends to the symmetric case with a bound involving $\delta_S$. An application to low-rank recovery with missing and noisy entries demonstrates the practical relevance, while the appendices extend the results to broader random-noise settings and motivate the approach with classical models such as matrix recovery, the spiked model, the stochastic block model (SBM), and Gaussian mixtures.
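To get a feel for how the three terms in the stated bound trade off, the right-hand side can be evaluated numerically. The following is a minimal sketch, not the authors' code: the function name `rank_p_bound` is hypothetical, all quantities are passed in as plain numbers, and the summary above does not define $\delta_p$, $x$, or $y$ precisely, so the caller is assumed to supply them according to the paper's definitions. The coupling condition under which the bound holds is not checked.

```python
def rank_p_bound(sigma_p, delta_p, norm_E, r, x, y):
    """Evaluate 32 * sigma_p * (||E||/sigma_p + r*x/delta_p + r^2*y/(sigma_p*delta_p)).

    Assumptions (not verified here): sigma_p > 0 and delta_p > 0, and the
    inputs follow the paper's definitions of delta_p and the skewness
    parameters x and y. This only evaluates the formula; it does not
    check the coupling condition required by the theorem.
    """
    if sigma_p <= 0 or delta_p <= 0:
        raise ValueError("sigma_p and delta_p must be positive")
    noise_term = norm_E / sigma_p                 # ||E|| / sigma_p
    linear_skew = r * x / delta_p                 # r x / delta_p
    quad_skew = r**2 * y / (sigma_p * delta_p)    # r^2 y / (sigma_p delta_p)
    return 32.0 * sigma_p * (noise_term + linear_skew + quad_skew)


# Illustrative inputs only; these values are made up for the example.
bound = rank_p_bound(sigma_p=10.0, delta_p=5.0, norm_E=1.0, r=2, x=0.5, y=0.25)
print(f"right-hand side of the bound: {bound:.4f}")
```

Multiplying through by $\sigma_p$, the first term contributes $32\|E\|$ regardless of the spectrum, so any gain over a noise-norm bound has to come from the skewness terms being small relative to $\delta_p$.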

Abstract

Let $A$ be an $m \times n$ matrix with rank $r$ and singular value decomposition $A = \sum_{i=1}^r \sigma_i u_i v_i^\top$, where $\sigma_i$ are its singular values, ordered decreasingly, and $u_i, v_i$ are the corresponding left and right singular vectors. For a parameter $1 \le p \le r$, $A_p := \sum_{i=1}^p \sigma_i u_i v_i^\top$ is the best rank-$p$ approximation of $A$. In practice, one often chooses $p$ to be small, leading to the commonly used phrase "low-rank approximation". Low-rank approximation plays a central role in data science because it can substantially reduce the dimensionality of the original data, the matrix $A$. For a large data matrix $A$, one typically computes a rank-$p$ approximation $A_p$ for a suitably chosen small $p$, stores $A_p$, and uses it as input for further computations. The reduced dimension of $A_p$ enables faster computations and significant data compression. In practice, noise is inevitable. We often have access only to noisy data $\tilde A = A + E$, where $E$ represents the noise. Consequently, the low-rank approximation used as input in many downstream tasks is $\tilde A_p$, the best rank-$p$ approximation of $\tilde A$, rather than $A_p$. Therefore, it is natural and important to estimate the error $\| \tilde A_p - A_p \|$. In this paper, we develop a novel method (based on contour analysis) to bound $\| \tilde A_p - A_p \|$. We introduce new parameters that measure the skewness between the noise matrix $E$ and the singular vectors of $A$, and exploit these to obtain notable improvements, in many settings, over classical approaches in the literature (using the Eckart-Young-Mirsky theorem or the Davis-Kahan theorem). This method is of independent interest and has many further applications.
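The quantity studied here, $\| \tilde A_p - A_p \|$, is easy to measure numerically on a synthetic example. The sketch below is an illustration under stated assumptions rather than anything taken from the paper: the helper `best_rank_p`, the matrix sizes, the singular values, and the Gaussian noise level are all hypothetical choices, and the norm is taken to be the spectral norm. It builds a rank-$r$ matrix $A$, perturbs it, and compares the two truncated SVDs.

```python
import numpy as np


def best_rank_p(M, p):
    """Best rank-p approximation of M via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Scale the first p left singular vectors by their singular values.
    return (U[:, :p] * s[:p]) @ Vt[:p, :]


rng = np.random.default_rng(0)
m, n, r, p = 200, 150, 5, 2

# Hypothetical rank-r ground matrix with prescribed singular values.
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
sigma = np.array([100.0, 80.0, 40.0, 20.0, 10.0])
A = (U * sigma) @ V.T

# Assumed noise model for the demo: i.i.d. Gaussian entries.
E = 0.1 * rng.standard_normal((m, n))
A_tilde = A + E

A_p = best_rank_p(A, p)
A_tilde_p = best_rank_p(A_tilde, p)

err = np.linalg.norm(A_tilde_p - A_p, 2)  # spectral norm of the error
noise = np.linalg.norm(E, 2)              # spectral norm of the noise

print(f"||A~_p - A_p|| = {err:.4f}")
print(f"||E||          = {noise:.4f}")
print(f"sigma_(p+1)    = {sigma[p]:.4f}")
```

For comparison, the triangle inequality combined with the Eckart-Young-Mirsky theorem and Weyl's inequality gives the elementary estimate $\| \tilde A_p - A_p \| \le 2(\sigma_{p+1} + \|E\|)$, which stays large whenever $\sigma_{p+1}$ is large. Running the sketch shows how far the measured error sits below that estimate for this particular random instance.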
