Conformalized Robust Principal Component Analysis

Liangliang Yuan, Lei Wang, Quan Kong, Liuhua Peng

Abstract

Robust principal component analysis (RPCA) is a widely used technique for recovering low-rank structure from matrices with missing entries and sparse, possibly large-magnitude corruptions. Although numerous algorithms achieve accurate point estimation, they offer little guidance on the uncertainty of recovered entries, limiting their reliability in practice. In this paper, we propose conformal prediction-RPCA (CP-RPCA), a practical and distribution-free framework for uncertainty quantification in robust matrix recovery. Our proposed method supports both split and full conformal implementations and incorporates weighted calibration to handle heterogeneous observation probabilities. We provide theoretical guarantees for finite-sample coverage and demonstrate through extensive simulations that CP-RPCA delivers reliable uncertainty quantification under severe outliers, missing data and model misspecification. Empirical results show that CP-RPCA can produce informative intervals and remain competitive in efficiency when the RPCA model is well specified, making it a scalable and robust tool for uncertainty-aware matrix analysis.

Paper Structure (39 sections, 10 theorems, 74 equations, 9 figures, 1 table, 4 algorithms)

Key Result

Lemma 1

If $(i_\ast,j_\ast)\mid\Omega_{\mathrm{obs}}\sim\mathrm{Unif}(\Omega_{\mathrm{obs}}^c)$, then [...], where the weights are defined as $\omega_{i_k j_k}=h_{i_kj_k}/\sum_{k^{\prime}=1}^{n_{\mathrm{cal}}+1}{h_{i_{k^{\prime}}j_{k^{\prime}}}}$ and $h_{ij}=(1-p_{ij})/p_{ij}$ are the odds ratios of the observation probabilities $p_{ij}$.
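
The weights in Lemma 1 can be computed directly from the observation probabilities $p_{ij}$. The sketch below is a minimal illustration, not the paper's implementation. The function names `odds_ratio_weights` and `weighted_conformal_quantile` are hypothetical. The quantile step follows the standard weighted split conformal construction, in which the test entry's weight is placed on $+\infty$; that construction and the use of nonnegative residual scores are assumptions, not details quoted from the paper.

```python
import numpy as np


def odds_ratio_weights(p):
    """Normalized weights w_k = h_k / sum_k' h_k', with h = (1 - p) / p.

    `p` holds the observation probabilities of the n_cal calibration
    entries followed by the test entry (length n_cal + 1).
    """
    p = np.asarray(p, dtype=float)
    if np.any(p <= 0.0) or np.any(p >= 1.0):
        raise ValueError("observation probabilities must lie strictly in (0, 1)")
    h = (1.0 - p) / p  # odds ratios h_ij from Lemma 1
    return h / h.sum()


def weighted_conformal_quantile(scores, weights, alpha):
    """Weighted (1 - alpha) quantile of the calibration scores.

    ASSUMPTION: standard weighted split conformal rule. `weights` has
    length n_cal + 1; its last entry belongs to the test point and is
    treated as mass at +infinity.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.shape[0] != scores.shape[0] + 1:
        raise ValueError("weights must have one more entry than scores")
    order = np.argsort(scores)
    cum = np.cumsum(weights[:-1][order])
    # Small tolerance guards against floating-point round-off in the sums.
    idx = np.searchsorted(cum, 1.0 - alpha - 1e-12, side="left")
    if idx >= scores.shape[0]:
        return np.inf  # calibration mass alone cannot reach level 1 - alpha
    return scores[order][idx]


# Illustrative inputs only (not data from the paper).
p = [0.5, 0.5, 0.5, 0.5, 0.5]      # four calibration entries + test entry
scores = [0.3, 0.1, 0.4, 0.2]      # e.g. absolute residuals on calibration entries
w = odds_ratio_weights(p)
q = weighted_conformal_quantile(scores, w, alpha=0.3)
```

Under these assumptions, an interval for the missing entry would take the form $\widehat{M}_{i_\ast j_\ast}\pm q$, where $\widehat{M}$ denotes the point estimate returned by the fitted RPCA model. Entries with small $p_{ij}$ have large odds ratios and therefore receive more weight in the calibration step.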

Figures (9)

  • Figure 1: Relationships among index sets in the two-stage CP-RPCA framework
  • Figure 2: Comparison of coverage effects under different observation modes and noise distributions
  • Figure 3: Comparison of recovery effects under heterogeneous noise
  • Figure 4: Comparison between RPCA and CP-RPCA
  • Figure 5: YaleB11 Face Feature Extraction and Confidence Interval
  • ...and 4 more figures

Theorems & Definitions (14)

  • Remark 1
  • Lemma 1
  • Corollary 1
  • Remark 2
  • Theorem 1
  • Corollary 2
  • Corollary 3
  • Remark 3
  • Remark 4
  • Theorem 2
  • ...and 4 more