Randomly pivoted Cholesky: Practical approximation of a kernel matrix with few entry evaluations

Yifan Chen; Ethan N. Epperly; Joel A. Tropp; Robert J. Webber

TL;DR

A thorough new investigation of the empirical and theoretical behavior of the randomly pivoted Cholesky algorithm, which provably returns low-rank approximations that are nearly optimal for matrix approximation problems that arise in scientific machine learning.

Abstract

The randomly pivoted partial Cholesky algorithm (RPCholesky) computes a factorized rank-k approximation of an N x N positive-semidefinite (psd) matrix. RPCholesky requires only (k + 1) N entry evaluations and O(k^2 N) additional arithmetic operations, and it can be implemented with just a few lines of code. The method is particularly useful for approximating a kernel matrix. This paper offers a thorough new investigation of the empirical and theoretical behavior of this fundamental algorithm. For matrix approximation problems that arise in scientific machine learning, experiments show that RPCholesky matches or beats the performance of alternative algorithms. Moreover, RPCholesky provably returns low-rank approximations that are nearly optimal. The simplicity, effectiveness, and robustness of RPCholesky strongly support its use in scientific computing and machine learning applications.
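
The abstract's cost claims can be made concrete with a short sketch. The following is a minimal illustration of randomly pivoted partial Cholesky, not the authors' reference implementation: the function name `rpcholesky`, the callback `get_column`, and the Gaussian-kernel test data are assumptions introduced here. Each step samples a pivot with probability proportional to the residual diagonal, reads one column of the matrix, and updates the factor. Reading the diagonal once plus k columns gives the (k + 1) N entry evaluations, and the k matrix-vector products with the growing factor give the O(k^2 N) arithmetic.

```python
import numpy as np

def rpcholesky(get_column, diag, k, rng=None):
    """Sketch of a rank-k randomly pivoted partial Cholesky factorization.

    get_column(j) -> column j of the psd matrix A (length N); hypothetical callback.
    diag          -> diagonal of A (length N).
    Returns F (N x k') with A ~= F @ F.T, and the list of chosen pivots.
    """
    rng = np.random.default_rng(rng)
    d = np.array(diag, dtype=float)        # diagonal of the current residual
    N = d.shape[0]
    F = np.zeros((N, k))
    pivots = []
    for i in range(k):
        total = d.sum()
        if total <= 0:                     # residual is numerically zero: stop early
            break
        s = rng.choice(N, p=d / total)     # pivot sampled proportionally to residual diagonal
        g = get_column(s) - F[:, :i] @ F[s, :i]   # column s of the residual matrix
        F[:, i] = g / np.sqrt(g[s])
        d = np.clip(d - F[:, i] ** 2, 0.0, None)  # update residual diagonal, guard rounding
        d[s] = 0.0                         # safeguard (an assumption here): never re-pick a pivot
        pivots.append(s)
    return F[:, :len(pivots)], pivots

# Illustrative data: Gaussian kernel matrix on random points.
# The full matrix is formed only so the error can be checked.
data_rng = np.random.default_rng(0)
X = data_rng.standard_normal((200, 5))
sq = np.sum(X ** 2, axis=1)
A = np.exp(-0.5 * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

F, pivots = rpcholesky(lambda j: A[:, j], np.diag(A).copy(), k=20, rng=1)
rel_err = np.trace(A - F @ F.T) / np.trace(A)   # relative trace-norm error
print(F.shape, rel_err)
```

In a real kernel setting, `get_column` would evaluate the kernel between one data point and all others on demand, so the full matrix is never stored.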

Paper Structure (53 sections, 13 theorems, 129 equations, 3 figures, 3 tables, 7 algorithms)


Motivation
Plan for paper
Notation
Randomly Pivoted Cholesky
Nyström approximation of a psd matrix
The pivoted partial Cholesky algorithm
Pivot selection rules
Greedy pivoting
Uniform random pivoting
Adaptive random pivoting
Pivoting with a Gibbs distribution
Numerical results: Illustrative examples
Numerical results: Real data
Theoretical results
History, Related Work, and Extensions
...and 38 more sections

Key Result

Theorem 2.3

Fix $r \in \mathbb{N}$ and $\varepsilon > 0$, and let $\boldsymbol{A}$ be a psd matrix. The column Nyström approximation $\boldsymbol{\widehat{A}}^{(k)}$ produced by RPCholesky (alg:rpcholesky) attains the error bound eq:1+eps provided that the number of columns, $k$, satisfies

Figures (3)

Figure 1: Rank-$k$ approximation of Gaussian kernel matrices. Median relative trace-norm error $\mathop{\mathrm{tr}}\nolimits \bigl(\boldsymbol{A} - \boldsymbol{\widehat{A}}^{(k)}\bigr) \slash \mathop{\mathrm{tr}}\nolimits \boldsymbol{A}$ and 20-80% quantile bars for several Nyström-based column approximation methods for Smile (left) and Spiral (right) examples. Selected pivots (colored stars) and data points (gray circles) for uniform, greedy, and RPCholesky methods are shown next to each panel.
Figure 2: Kernel ridge regression for QM9 data. Left: Prediction error (eq:smape) for several Nyström algorithms. Right: Relative trace-norm error.
Figure 3: Spectral clustering for alanine dipeptide trajectories. Top left: Misclassification rate, averaged over 1000 independent trials. Top right: Example of correct clustering ($<0.2\%$ misclassification) produced by RPCholesky with rank $k = 150$. Bottom: Incorrect clusterings ($>2\%$ misclassification) produced by uniform, RLS, and greedy sampling with rank $k = 150$. Black dots mark data points selected as pivots.

Theorems & Definitions (19)

Theorem 2.3: Randomly pivoted Cholesky: simplified bound
Proposition 3.2: Deshpande et al. DRVW06
Theorem 5.1: Randomly pivoted Cholesky
Corollary 5.2: Randomly pivoted QR
Lemma 5.3: Expected residual
Lemma 5.4: Contraction rate
Lemma 5.5: Error doubling
proof: Proof of thm:main_bound
Theorem C.1: Nyström lower bound DV06, GS12
proof
...and 9 more
