Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems

Younghyun Cho, James W. Demmel, Michał Dereziński, Haoyun Li, Hengrui Luo, Michael W. Mahoney, Riley J. Murray

TL;DR

The paper tackles the parameter-tuning challenge in RandNLA, focusing on SAP-based randomized least squares. It introduces a surrogate-based autotuning pipeline using Gaussian-process-based Bayesian optimization, transfer learning, and a stochastic objective that balances runtime with solution accuracy via ARFE. Empirical results show near-optimal performance with substantially fewer evaluations than random search (up to about 4x fewer trials), and transfer learning further accelerates tuning on new inputs, including real-world datasets. The approach is presented as a general workflow applicable to diverse RandNLA algorithms, with practical guidance on implementing and extending the autotuning framework. This work enables more reliable and efficient deployment of RandNLA methods in large-scale regression problems by automating tuning across problem instances and hardware platforms.

Abstract

Algorithms from Randomized Numerical Linear Algebra (RandNLA) are known to be effective in handling high-dimensional computational problems, providing high-quality empirical performance as well as strong probabilistic guarantees. However, their practical application is complicated by the fact that the user needs to set various algorithm-specific tuning parameters which are different than those used in traditional NLA. This paper demonstrates how a surrogate-based autotuning approach can be used to address fundamental problems of parameter selection in RandNLA algorithms. In particular, we provide a detailed investigation of surrogate-based autotuning for sketch-and-precondition (SAP) based randomized least squares methods, which have been one of the great success stories in modern RandNLA. Empirical results show that our surrogate-based autotuning approach can achieve near-optimal performance with much less tuning cost than a random search (up to about 4x fewer trials of different parameter configurations). Moreover, while our experiments focus on least squares, our results demonstrate a general-purpose autotuning pipeline applicable to any kind of RandNLA algorithm.
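
To make the tuning problem concrete, the following is a minimal NumPy sketch of a sketch-and-precondition solver. It is not the authors' implementation: the names `sampling_factor` and `nnz`, the sparse sign sketch, the QR-based preconditioner, and the CGLS inner solver are all illustrative assumptions, chosen to expose the kind of parameters an autotuner would search over.

```python
import numpy as np

def sparse_sign_sketch(d, m, nnz, rng):
    """d x m sketching matrix with `nnz` random +/-1 entries per column (hypothetical helper)."""
    S = np.zeros((d, m))
    for j in range(m):
        rows = rng.choice(d, size=nnz, replace=False)
        S[rows, j] = rng.choice([-1.0, 1.0], size=nnz)
    return S / np.sqrt(nnz)

def sap_least_squares(A, b, sampling_factor=4, nnz=4, tol=1e-10, max_iter=100, seed=0):
    """Approximately solve min_x ||Ax - b||_2 by sketch-and-precondition (illustrative)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    d = int(sampling_factor * n)              # number of rows in the sketch
    S = sparse_sign_sketch(d, m, min(nnz, d), rng)

    # Sketch: factor SA = QR and use R^{-1} as a right preconditioner.
    _, R = np.linalg.qr(S @ A)

    # Precondition: run CGLS on min_y ||(A R^{-1}) y - b||_2, then set x = R^{-1} y.
    y = np.zeros(n)
    r = b.astype(float)                       # residual b - (A R^{-1}) y, as a fresh copy
    s = np.linalg.solve(R.T, A.T @ r)         # gradient direction (A R^{-1})^T r
    p = s.copy()
    gamma = s @ s
    norm0 = np.sqrt(gamma)
    iters = 0
    for iters in range(1, max_iter + 1):
        q = A @ np.linalg.solve(R, p)
        alpha = gamma / (q @ q)
        y += alpha * p
        r -= alpha * q
        s = np.linalg.solve(R.T, A.T @ r)
        gamma_new = s @ s
        if np.sqrt(gamma_new) <= tol * norm0:
            break
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return np.linalg.solve(R, y), iters

# Usage on a synthetic tall matrix with badly scaled columns.
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 20)) * np.logspace(0, 2, 20)
b = rng.standard_normal(2000)
x, iters = sap_least_squares(A, b)
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(iters, np.linalg.norm(x - x_ref) / np.linalg.norm(x_ref))
```

Varying `sampling_factor` and `nnz` in this sketch trades sketching cost against the number of iterations, which is the kind of trade-off a tuner has to navigate.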

Paper Structure (41 sections, 1 theorem, 13 equations, 9 figures, 5 tables, 2 algorithms)

Key Result

Proposition 3.1

Let $\bm{U}$ be an $m \times k$ matrix whose columns are an orthonormal basis for the range of $\bm{A}$. If $\operatorname{rank}(\bm{S}\bm{A}) = k$ and $\bm{M}$ is an $n \times k$ matrix for which $\bm{S}\bm{A}\bm{M}$ is orthogonal, then the spectrum of ...
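
The construction in Proposition 3.1 can be tried numerically. The snippet below is a minimal sketch under stated assumptions: a dense Gaussian sketching matrix, a full-column-rank test matrix (so $k = n$), and $\bm{M} = \bm{R}^{-1}$ taken from a QR factorization of $\bm{S}\bm{A}$, which is one way to make $\bm{S}\bm{A}\bm{M}$ have orthonormal columns. The sizes and the column scaling are arbitrary illustrative choices. It compares the condition numbers of $\bm{A}$ and $\bm{A}\bm{M}$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 1000, 10

# Ill-conditioned test matrix: Gaussian columns scaled over six orders of magnitude.
A = rng.standard_normal((m, n)) * np.logspace(0, 6, n)

d = 4 * n                                      # sketch size (illustrative choice)
S = rng.standard_normal((d, m)) / np.sqrt(d)   # dense Gaussian sketch (assumption)

Q, R = np.linalg.qr(S @ A)                     # S A = Q R with Q having orthonormal columns
M = np.linalg.inv(R)                           # n x n here, since A has full column rank

SAM = S @ A @ M                                # equals Q up to rounding error
cond_A = np.linalg.cond(A)
cond_AM = np.linalg.cond(A @ M)
print(f"cond(A)  = {cond_A:.2e}")
print(f"cond(AM) = {cond_AM:.2e}")
```

In this setting the preconditioned matrix is far better conditioned than the original one, which is what makes an iterative solver applied to it converge quickly.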

Figures (9)

  • Figure 1: Performance of a sketch-and-precondition (SAP) least squares algorithm with varying sketching matrices, i.e., with different sizes and/or different numbers of non-zeros in each row (nnz), for two different input matrices.
  • Figure 1: Graphical presentation of the tuning procedure in the tuning pipeline.
  • Figure 1: Landscape of parameter configurations over a grid of parameter combinations. The labels on each plot mark the optimal performance and its parameter configuration in each category. Each data point was run with three safety_factor values ($0$, $2$, and $4$), and the plot shows the best result among the three.
  • Figure 1: Tuning results for the three real-world input matrices for varying values of the user-given constants penalty_factor, allowance_factor, and safety_factor.
  • Figure 2: Overview of our autotuning framework for SAP-based randomized least squares methods. The SAP methodology and its tuning opportunity trichotomy are detailed in Section \ref{sec:sap_paradigm}. The actual parameter types, their search space, and the tuning algorithms are detailed in Section \ref{sec:autotuning}.
  • ...and 4 more figures

Theorems & Definitions (1)

  • Proposition 3.1