Embrace rejection: Kernel matrix approximation by accelerated randomly pivoted Cholesky

Ethan N. Epperly; Joel A. Tropp; Robert J. Webber

TL;DR

This work addresses the problem of efficiently constructing low-rank approximations of positive-semidefinite (psd) kernel matrices under a submatrix access model. It introduces accelerated RPCholesky, which proposes pivots in blocks and uses rejection sampling to reproduce the pivot distribution of simple RPCholesky, running over $40\times$ faster on kernel matrix approximation without sacrificing approximation quality. Theoretical guarantees are developed via a novel expected residual function and permutation averaging, yielding a main error bound that informs the choice of parameters. Experiments on a testbed of 125 kernel matrices and on molecular potential energy surface (PES) problems demonstrate significant practical acceleration and reliable performance, and the approach extends to accelerated randomly pivoted QR. Overall, the method enables scalable kernel-based preconditioning and low-rank approximation for very large data sets in scientific computing.
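For orientation, the pivot distribution of simple RPCholesky, which the accelerated method is designed to reproduce, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `rp_cholesky` is hypothetical, the matrix is real-valued and stored densely (whereas the paper works in a submatrix access model), and no safeguards beyond clipping the residual diagonal are included.

```python
import numpy as np

def rp_cholesky(A, k, rng=None):
    """Sketch of simple randomly pivoted Cholesky (hypothetical helper).

    Assumes A is a real, dense, symmetric psd NumPy array.
    Returns F (N x k) with A approximately equal to F @ F.T,
    plus the list of selected pivot indices.
    """
    rng = np.random.default_rng(rng)
    N = A.shape[0]
    F = np.zeros((N, k))
    d = np.diag(A).astype(float).copy()  # diagonal of the residual matrix
    pivots = []
    for i in range(k):
        total = d.sum()
        if total <= 0:  # residual is numerically zero: stop early
            return F[:, :i], pivots
        # Sample a pivot with probability proportional to the residual diagonal.
        s = int(rng.choice(N, p=d / total))
        # Column s of the residual: read one column of A, subtract the
        # contribution of the factors computed so far.
        g = A[:, s] - F[:, :i] @ F[s, :i]
        F[:, i] = g / np.sqrt(g[s])
        # Update the residual diagonal; clip rounding noise at zero.
        d = np.maximum(d - F[:, i] ** 2, 0.0)
        d[s] = 0.0  # a chosen pivot is never selected again
        pivots.append(s)
    return F, pivots
```

Each iteration reads a single column of the matrix, which is why the simple method cannot exploit block matrix operations; the accelerated variant changes how pivots are generated, not the distribution they follow.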

Abstract

Randomly pivoted Cholesky (RPCholesky) is an algorithm for constructing a low-rank approximation of a positive-semidefinite matrix using a small number of columns. This paper develops an accelerated version of RPCholesky that employs block matrix computations and rejection sampling to efficiently simulate the execution of the original algorithm. For the task of approximating a kernel matrix, the accelerated algorithm can run over $40\times$ faster. The paper contains implementation details, theoretical guarantees, experiments on benchmark data sets, and an application to computational chemistry.
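The combination of block computations and rejection sampling described in the abstract can be sketched as follows. This is an illustrative reading of the idea rather than the paper's reference code: the name `accelerated_rp_cholesky` and its parameters are hypothetical, the matrix is assumed real and dense, and the rejection rule (accept a proposed pivot with probability equal to its current residual diagonal divided by its residual diagonal at the start of the round) is the standard construction assumed here.

```python
import numpy as np

def accelerated_rp_cholesky(A, block_size, rounds, rng=None):
    """Sketch of block proposals plus rejection sampling (hypothetical helper).

    Assumes A is a real, dense, symmetric psd NumPy array.
    Returns F with A approximately equal to F @ F.T, and the accepted pivots.
    """
    rng = np.random.default_rng(rng)
    N = A.shape[0]
    F = np.zeros((N, 0))
    d = np.diag(A).astype(float).copy()  # residual diagonal
    pivots = []
    for _ in range(rounds):
        total = d.sum()
        if total <= 0:
            break
        # Propose a block of pivots i.i.d. from the current residual diagonal.
        prop = rng.choice(N, size=block_size, p=d / total)
        # Residual restricted to the proposals: a small b x b submatrix.
        H = A[np.ix_(prop, prop)] - F[prop] @ F[prop].T
        u = np.diag(H).copy()  # proposal weights at the start of the round
        L = H.copy()
        accepted = []
        for i in range(block_size):
            # Accept with probability (current residual diagonal) / u[i].
            if u[i] * rng.random() < L[i, i]:
                accepted.append(i)
                # One Cholesky elimination step on the small submatrix.
                L[i:, i] /= np.sqrt(L[i, i])
                L[i + 1:, i + 1:] -= np.outer(L[i + 1:, i], L[i + 1:, i])
        if not accepted:
            continue
        S = prop[accepted]
        # Cholesky factor of the residual submatrix on the accepted pivots.
        L_acc = np.tril(L[np.ix_(accepted, accepted)])
        # Block update: read all accepted columns at once.
        G = A[:, S] - F @ F[S].T
        F_new = np.linalg.solve(L_acc, G.T).T
        F = np.hstack([F, F_new])
        d = np.maximum(d - np.sum(F_new ** 2, axis=1), 0.0)
        d[S] = 0.0
        pivots.extend(S.tolist())
    return F, pivots
```

In this sketch the sequential accept/reject loop touches only the small proposal submatrix, while the expensive access to full columns happens once per round as a block operation. The number of accepted pivots per round is random and at most `block_size`.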

Paper Structure (37 sections, 11 theorems, 85 equations, 4 figures, 3 tables, 7 algorithms)

Introduction
Access models for matrices
Kernel matrices and the submatrix access model
Efficient low-rank approximation in the submatrix access model
Simple RPCholesky
Block RPCholesky
Accelerated RPCholesky
Illustrative comparison
Outline
Notation
Faster RPCholesky by rejection sampling
Implementing RPCholesky with rejection sampling
Efficient rejection sampling in the submatrix access model
Accelerated RPCholesky
Low-memory implementation
...and 22 more sections

Key Result

Theorem 4.1

Consider a psd matrix $\boldsymbol{A} \in \mathbb{C}^{N \times N}$, and let $\boldsymbol{A}^{(k)}$ denote the random residual after applying simple or accelerated RPCholesky with $k$ random pivots. Then as soon as

Figures (4)

Figure 1: Columns generated per second for Gaussian (left) and $\ell_1$ Laplace (right) kernel matrices with bandwidth $\sigma = 1$. The data consists of $N=10^5$ standard Gaussian points with dimension $d \in \{ 1, 10, 100, 1000 \}$.
Figure 1: Speed-up factor (top) and trace error ratio (bottom) of block and accelerated RPCholesky compared to simple RPCholesky on the testbed of 125 kernel matrices. Examples are sorted by the block RPCholesky speed-up factor (top) or the block RPCholesky error ratio (bottom).
Figure 2: Runtime (left) and relative trace norm error (right) of RPCholesky, block RPCholesky, and accelerated RPCholesky applied to a Gaussian kernel matrix $\boldsymbol{A}$ associated with $10^5$ points in $\mathbb{R}^2$ forming a smile (inset, right). The blocked methods use block size $b=120$.
Figure 2: Runtime needed to generate the preconditioner (prep time) and run the necessary number of CG iterations until $\lVert \boldsymbol{A} \boldsymbol{\beta}^{(t)} - \boldsymbol{y} \rVert / \lVert \boldsymbol{y} \rVert < 10^{-3}$ (PCG time). Each calculation is performed once and the results are averaged over the eight molecules.

Theorems & Definitions (19)

Theorem 4.1: Sufficient pivots for simple and accelerated RPCholesky [CETW23]
Theorem 4.2: Sufficient iterations for accelerated and block RPCholesky
Theorem 4.3: Sufficient iterations for block RPCholesky with a large block size [DRVW06]
Lemma 4.4: Central error bound
Lemma 4.5: Permutation averaging
Lemma 4.6: Expected residual properties
Proof 1
Lemma 4.7: Expected residual properties for matrices
Proof 2
Proposition 4.8: Block and accelerated RPCholesky on a random matrix
...and 9 more
