Make the most of what you have: Resource-efficient randomized algorithms for matrix computations

Ethan N. Epperly

TL;DR

This work develops resource-efficient randomized algorithms for matrix computations, unifying PSD low-rank approximation, implicit-matrix attribute estimation, and stabilized randomized least-squares. It introduces randomly pivoted Cholesky to achieve near-optimal column Nyström approximations using limited entry access, and leverages the Gram correspondence to connect projection and Nyström frameworks. It also analyzes leave-one-out estimators for implicit matrices, demonstrates backward-stable randomized LS, and extends these ideas to kernel methods and Gaussian processes, including infinite-dimensional extensions via rejection sampling. The combination of theoretical guarantees and practical algorithms offers scalable, accurate tools for kernelized learning, large-scale linear algebra, and related domains, with demonstrated benefits in kernel interpolation, GP regression, and preconditioning for KRR. The work thus provides a cohesive toolkit for deploying randomized matrix methods in practical scientific computing contexts.

Abstract

In recent years, randomized algorithms have established themselves as fundamental tools in computational linear algebra, with applications in scientific computing, machine learning, and quantum information science. Many randomized matrix algorithms proceed by first collecting information about a matrix and then processing that data to perform some computational task. This thesis addresses the following question: How can one design algorithms that use this information as efficiently as possible, reliably achieving the greatest possible speed and accuracy for a limited data budget? The first part of this thesis focuses on low-rank approximation for positive-semidefinite matrices. Here, the goal is to compute an accurate approximation to a matrix after accessing as few entries of the matrix as possible. This part of the thesis explores the randomly pivoted Cholesky (RPCholesky) algorithm for this task, which achieves a level of speed and reliability greater than other methods for the same problem. The second part of this thesis considers the task of estimating attributes of an implicit matrix accessible only by matrix-vector products. This thesis describes the leave-one-out approach to developing matrix attribute estimation algorithms and develops optimized trace, diagonal, and row-norm estimation algorithms. The third part of this thesis considers randomized algorithms for overdetermined linear least-squares problems. Randomized algorithms for linear least-squares problems are asymptotically faster than any known deterministic algorithm, but recent work has raised questions about the accuracy of these methods in floating-point arithmetic. This thesis shows these issues are resolvable by developing fast randomized least-squares algorithms that achieve backward stability, the gold-standard stability guarantee for a numerical algorithm.
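To make the first part of the abstract concrete, here is a minimal sketch of randomly pivoted Cholesky as it is commonly described: at each step a pivot column is sampled with probability proportional to the diagonal of the current residual, and that column is then eliminated. The function name `rpcholesky`, its signature, and the dense NumPy input are assumptions made for illustration, not the thesis's reference implementation. A practical version would evaluate only the requested columns and the diagonal of the matrix instead of storing it in full.

```python
import numpy as np

def rpcholesky(A, k, seed=None):
    """Illustrative sketch: rank-k approximation A ~= F @ F.T of a PSD matrix A.

    Reads only the diagonal of A and the k sampled pivot columns.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    F = np.zeros((n, k))
    d = np.array(np.diag(A), dtype=float)  # diagonal of the residual A - F @ F.T
    pivots = []
    for i in range(k):
        total = d.sum()
        if total <= 0:
            # Residual has vanished: A is already reproduced exactly.
            return F[:, :i], pivots
        # Sample a pivot with probability proportional to the residual diagonal.
        s = rng.choice(n, p=d / total)
        # Column s of the residual matrix.
        g = A[:, s] - F[:, :i] @ F[s, :i]
        F[:, i] = g / np.sqrt(g[s])
        # Update the residual diagonal; clip tiny negative values from roundoff.
        d = np.maximum(d - F[:, i] ** 2, 0.0)
        pivots.append(s)
    return F, pivots
```

Calling `F, pivots = rpcholesky(A, k)` gives the column Nyström approximation `F @ F.T` associated with the selected pivot columns. For a matrix of exact rank `k`, the approximation should reproduce `A` up to roundoff.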

Paper Structure

This paper contains 261 sections, 77 theorems, 833 equations, 44 figures, 3 tables.

Key Result

proposition 1

Let $\mat{B} \in \field^{m\times n}$ and $\mat{\Omega} \in \field^{n\times k}$ be matrices, and consider the projection approximation $\mat{\Pi}_{\mat{B}\mat{\Omega}} \mat{B}$. Then

Figures (44)

  • Figure 1: List of chapters in this thesis and dependencies between them. Blue circles indicate chapters whose content is primarily introductory or expository, orange squares indicate sections primarily containing research from my PhD, and green squircles indicate open questions. Starred sections contain new research that has not previously been published.
  • Figure 2: Left: Univariate restrictions $\kappa(\cdot,a)$ of the Sobolev kernel for $a=0.25$ and $a=0.5$. Right: Contour plot of the Sobolev kernel.
  • Figure 3: RKHS function (left) and single draw of a Gaussian process (right) for the same positive-definite kernel $\kappa$. The draw from the Gaussian process is observed to be much "rougher" than the RKHS function.
  • Figure 4: Fitting of Nobel laureate ages by prize year using GPR for three values of the regularization $\lambda = 10^{-13}$ (left), $\lambda = 1$ (middle), and $\lambda = 100$ (right); further details are in the text.
  • Figure 5: Relative residual (left) and SMAPE test error (right) for column Nyström-preconditioned conjugate gradient with greedy (blue circles), uniform (purple squares), (orange asterisks), and no preconditioning (yellow).
  • ...and 39 more figures

Theorems & Definitions (195)

  • definition 1: Projection approximation
  • proposition 1: Properties of projection approximations
  • remark 1: Left versus right
  • remark 2: Counting subspace iteration steps
  • remark 3: Block Krylov iteration
  • definition 2: Nyström approximation
  • proposition 2: Properties of the Nyström approximation
  • remark 4: Hermitian indefinite matrices
  • theorem 1: Gram correspondence
  • proof
  • ...and 185 more