Learning to Discover Iterative Spectral Algorithms

Zihang Liu, Oleg Balabanov, Yaoqing Yang, Michael W. Mahoney

TL;DR

AutoSpec introduces a neural framework to discover spectrum-adaptive iterative algorithms for large-scale numerical linear algebra (NLA) and numerical optimization by predicting recurrence coefficients that define an executable matrix polynomial P(·). The approach uses coarse spectral probes and self-supervised task objectives, trains on synthetic diagonal operators, and transfers to real-world sparse matrices. The learned recurrences improve convergence and accuracy over baselines while exhibiting Chebyshev-like minimax properties. Two recurrence paradigms are explored: an affine three-term recurrence, and basis generation followed by a learned expansion. Both are parameterized by a neural engine and implemented in a matrix-free, low-storage fashion. Across eigenvalue problems, linear systems, and matrix-function tasks, AutoSpec achieves orders-of-magnitude improvements in iteration counts or operator-norm residuals, demonstrating robust performance under limited spectral information and suggesting a practical path to automated, state-of-the-art NLA methods.

Abstract

We introduce AutoSpec, a neural network framework for discovering iterative spectral algorithms for large-scale numerical linear algebra and numerical optimization. Our self-supervised models adapt to input operators using coarse spectral information (e.g., eigenvalue estimates and residual norms), and they predict recurrence coefficients for computing or applying a matrix polynomial tailored to a downstream task. The effectiveness of AutoSpec relies on three ingredients: an architecture whose inference pass implements short, executable numerical linear algebra recurrences; efficient training on small synthetic problems with transfer to large-scale real-world operators; and task-defined objectives that enforce the desired approximation or preconditioning behavior across the range of spectral profiles represented in the training set. We apply AutoSpec to discovering algorithms for representative numerical linear algebra tasks: accelerating matrix-function approximation; accelerating sparse linear solvers; and spectral filtering/preconditioning for eigenvalue computations. On real-world matrices, the learned procedures deliver orders-of-magnitude improvements in accuracy and/or reductions in iteration count, relative to basic baselines. We also find clear connections to classical theory: the induced polynomials often exhibit near-equiripple, near-minimax behavior characteristic of Chebyshev polynomials.

Paper Structure

This paper contains 81 sections, 2 theorems, 27 equations, 14 figures, 4 tables, 3 algorithms.

Key Result

Proposition 3.1

Consider eq:three_term_recurrence. If $\rho_k=0$ for all $1\le k\le d$, then $\mathbf{C}_k$ satisfies eq:affine_three_term with $\mathbf{C}_0=\mathbf{I}$ and $\mathbf{C}_1=\mathbf{X}$, and for $k\ge 1$,

Figures (14)

  • Figure 1: AutoSpec framework. (a) A trained neural network engine, paired with an executable recurrence, defines a discovered numerical algorithm for downstream NLA tasks. (b) An end-to-end differentiable algorithmic structure: given operator $\mathbf{X}$, a spectral probe $(\widehat{\boldsymbol{\lambda}}, \widehat{\boldsymbol{r}})$, consisting of coarse eigenvalue estimates and residual norms, is extracted and fed to the neural engine to produce the coefficients of a degree-$d$ polynomial $P(\cdot)$. The polynomial is implemented in a matrix-free manner via a short recurrence: starting from $\mathbf{V}_0 = \mathbf{z}$, we iterate $\mathbf{V}_{k+1} = \mathbf{M}_k(\mathbf{X}) \mathbf{V}_k$, where each $\mathbf{M}_k$ is parameterized by the engine. The terminal state returns the action $\mathbf{V}_{d+1} = P(\mathbf{X}) \mathbf{z}$ of the polynomial to inputs $\mathbf{z}$, which allows self-supervised training by backpropagating task-defined NLA losses on $\mathbf{V}_{d+1}$.
  • Figure 2: Convergence of eigenvalue approximations versus eigs outer iterations for (shifted) SiO2 and thermal2, for varying target numbers of eigenvalues $k$ and subspace dimension $l=4k$. The NN preconditioner is produced using a spectral probe from 20 eigs warm-start iterations.
  • Figure 3: Convergence of the CG solver with NN preconditioning, using a spectral probe computed with 50 Lanczos iterations.
  • Figure 4: Whitening the covariance matrix of the E. coli K-12 MG1655 reference genome sequence.
  • Figure 5: Comparison of learned eigen preconditioners on SiO2 and thermal2, generated with versus without providing estimated eigenvalue residuals as input to the neural engine. Performance is reported as the log improvement in the target-boundary eigenvalue gap relative to the standard subspace iteration baseline $P(\mathbf{X}) = \mathbf{X}^d$.
  • ...and 9 more figures
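
The matrix-free recurrence described in the Figure 1 caption can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. The function name `apply_three_term`, the coefficient arrays `alpha`, `beta`, `gamma`, and the specific form $\mathbf{C}_{k+1} = (\alpha_k \mathbf{X} + \beta_k \mathbf{I})\mathbf{C}_k + \gamma_k \mathbf{C}_{k-1}$ are assumptions chosen for illustration. In AutoSpec the coefficients would come from the neural engine; here they are fixed by hand to the classical Chebyshev values so that the result can be checked.

```python
import numpy as np

def apply_three_term(matvec, z, alpha, beta, gamma):
    """Return C_d(X) z using only matrix-vector products with X.

    Assumed (illustrative) recurrence, with C_0 = I and C_1 = X:
        C_{k+1} = (alpha_k X + beta_k I) C_k + gamma_k C_{k-1}
    Each coefficient array holds d - 1 entries, one per step.
    Only two vectors are kept in memory at any time.
    """
    prev, curr = z, matvec(z)  # C_0(X) z and C_1(X) z
    for a, b, g in zip(alpha, beta, gamma):
        prev, curr = curr, a * matvec(curr) + b * curr + g * prev
    return curr

# Synthetic diagonal operator X = diag(lam), applied without forming X.
lam = np.linspace(-1.0, 1.0, 7)
matvec = lambda v: lam * v

# Hand-picked coefficients (not learned): alpha = 2, beta = 0, gamma = -1
# reproduce the Chebyshev recurrence T_{k+1} = 2 x T_k - T_{k-1}.
d = 5
out = apply_three_term(matvec, np.ones_like(lam),
                       alpha=np.full(d - 1, 2.0),
                       beta=np.zeros(d - 1),
                       gamma=np.full(d - 1, -1.0))
```

With these coefficients and $\mathbf{z}$ equal to the all-ones vector, `out` holds $T_5(\lambda_i)$ for each diagonal entry, which gives a simple correctness check against $\cos(5\arccos\lambda_i)$.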

Theorems & Definitions (2)

  • Proposition 3.1: Affine three-term polynomial recurrence
  • Proposition 3.2: Expansion in a learned three-term basis
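
The second paradigm, basis generation followed by an expansion, can be sketched in the same matrix-free style. The code below is a hedged illustration only: the function name `apply_basis_expansion`, the recurrence form, and the expansion weights `coeffs` are assumptions, not the statement of Proposition 3.2. The basis is fixed to Chebyshev polynomials so the output can be compared against NumPy's `chebval`.

```python
import numpy as np
from numpy.polynomial import chebyshev

def apply_basis_expansion(matvec, z, alpha, beta, gamma, coeffs):
    """Return sum_k coeffs[k] * B_k(X) z without forming any matrix.

    Assumed (illustrative) basis, with B_0 = I and B_1 = X:
        B_{k+1} = (alpha_k X + beta_k I) B_k + gamma_k B_{k-1}
    Requires len(coeffs) >= 2; the running sum `acc` is updated as each
    basis vector is generated, so no basis vectors are stored.
    """
    prev, curr = z, matvec(z)                    # B_0(X) z and B_1(X) z
    acc = coeffs[0] * prev + coeffs[1] * curr
    for a, b, g, c in zip(alpha, beta, gamma, coeffs[2:]):
        prev, curr = curr, a * matvec(curr) + b * curr + g * prev
        acc = acc + c * curr
    return acc

# Synthetic diagonal operator and hand-picked (not learned) parameters.
lam = np.linspace(-0.9, 0.9, 5)
matvec = lambda v: lam * v
coeffs = np.array([0.5, -1.0, 0.25, 2.0])        # degree-3 expansion
steps = len(coeffs) - 2                          # recurrence steps beyond B_1
result = apply_basis_expansion(matvec, np.ones_like(lam),
                               alpha=np.full(steps, 2.0),
                               beta=np.zeros(steps),
                               gamma=np.full(steps, -1.0),
                               coeffs=coeffs)
reference = chebyshev.chebval(lam, coeffs)       # sum_k coeffs[k] T_k(lam)
```

Separating the basis recurrence from the expansion weights means the same generated basis vectors could, in principle, serve several target polynomials, at the cost of one extra accumulator vector.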