Parametric kernel low-rank approximations using tensor train decomposition

Abraham Khan, Arvind K. Saibaba

TL;DR

This work introduces Parametric TT Kernel (PTTK) methods to efficiently approximate parametric kernel matrices via multivariate Chebyshev interpolation compressed by tensor-train (TT) decomposition. The approach splits the computation into an offline stage, which is independent of the hyperparameters, and an online stage, which evaluates a compact core $H(\theta)$ used to assemble $K(X,Y;\theta)$; the cost scales linearly in $N_s$ and $N_t$ offline and is independent of them online. The authors provide a rigorous error decomposition, cost analyses, and a global low-rank variant that preserves symmetry and positive semidefiniteness when possible. Numerical experiments across nonparametric and parametric kernels, along with comparisons to ACA and real-data tests, demonstrate up to $200\times$ online speedups and favorable accuracy, with extensions to higher dimensions and potential integration into hierarchical matrix frameworks. The work includes practical TT-cross initialization techniques and open-source code, highlighting a scalable path for parametric kernel evaluation in scientific computing and data science.

Abstract

Computing low-rank approximations of kernel matrices is an important problem with many applications in scientific computing and data science. We propose methods to efficiently approximate and store low-rank approximations to kernel matrices that depend on certain hyperparameters. The main idea behind our method is to use multivariate Chebyshev function approximation along with the tensor train decomposition of the coefficient tensor. The computations are in two stages: an offline stage, which dominates the computational cost and is parameter-independent, and an online stage, which is inexpensive and instantiated for specific hyperparameters. A variation of this method addresses the case that the kernel matrix is symmetric and positive semi-definite. The resulting algorithms have linear complexity in terms of the sizes of the kernel matrices. We investigate the efficiency and accuracy of our method on parametric kernel matrices induced by various kernels, such as the Matérn kernel, through various numerical experiments. Our methods have speedups up to $200\times$ in the online time compared to other methods with similar complexity and comparable accuracy.
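
The offline/online split described in the abstract can be illustrated with a minimal sketch. This is not the authors' PTTK implementation: it assumes one-dimensional source and target points, a single hyperparameter, and a squared-exponential kernel chosen only for illustration. It interpolates kernel values at Chebyshev nodes in Lagrange form and stores the resulting tensor densely, with no TT compression. All function names (`kernel`, `cheb_nodes`, `lagrange_matrix`, `offline`, `online`) are hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def kernel(x, y, theta):
    # Illustrative squared-exponential kernel with length scale theta (an assumption).
    return np.exp(-((x - y) ** 2) / theta**2)


def cheb_nodes(a, b, n):
    # First-kind Chebyshev nodes mapped from [-1, 1] to [a, b].
    return 0.5 * (a + b) + 0.5 * (b - a) * C.chebpts1(n)


def lagrange_matrix(pts, a, b, n):
    # Entry (p, i) is the i-th Lagrange basis polynomial for the n mapped
    # Chebyshev nodes on [a, b], evaluated at pts[p].
    z = (2.0 * np.asarray(pts, dtype=float) - a - b) / (b - a)
    V_nodes = C.chebvander(C.chebpts1(n), n - 1)
    V_pts = C.chebvander(z, n - 1)
    return np.linalg.solve(V_nodes.T, V_pts.T).T


def offline(X, Y, box_x, box_y, box_theta, n):
    # Parameter-independent work: factor matrices for sources and targets,
    # plus the kernel sampled on the tensor grid of nodes in (x, y, theta).
    U = lagrange_matrix(X, *box_x, n)
    V = lagrange_matrix(Y, *box_y, n)
    xi, eta, tau = (cheb_nodes(*box, n) for box in (box_x, box_y, box_theta))
    M = kernel(xi[:, None, None], eta[None, :, None], tau[None, None, :])
    return U, V, M


def online(M, theta, box_theta):
    # Contract the parameter mode to obtain the small n-by-n core H(theta);
    # the cost here does not depend on the number of sources or targets.
    w = lagrange_matrix([theta], *box_theta, M.shape[2])[0]
    return np.einsum("ijk,k->ij", M, w)


rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=200)   # source points
Y = rng.uniform(2.0, 3.0, size=150)   # target points, separated from the sources
box_x, box_y, box_theta = (0.0, 1.0), (2.0, 3.0), (1.0, 2.0)
n = 16                                # Chebyshev nodes per variable

U, V, M = offline(X, Y, box_x, box_y, box_theta, n)

theta = 1.3
H = online(M, theta, box_theta)
K_hat = U @ H @ V.T                   # rank-at-most-n approximation of K(X, Y; theta)
K_ref = kernel(X[:, None], Y[None, :], theta)
rel_err = np.linalg.norm(K_hat - K_ref) / np.linalg.norm(K_ref)
print(f"relative Frobenius error: {rel_err:.2e}")
```

In this sketch the dense tensor `M` has $n^3$ entries, which is what becomes impractical as the spatial and parameter dimensions grow; that storage is the part the TT decomposition in the paper is meant to compress.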

Paper Structure (41 sections, 3 theorems, 75 equations, 2 figures, 8 tables, 2 algorithms)

Key Result

Lemma 1

Let $\zeta_1, \zeta_2, \dots, \zeta_n$ be the $n$ Chebyshev nodes of the first kind in the interval $[-1, 1]$. Then where $\lambda_{n-1}$ is the Lebesgue constant; see trefethen2020approximation.

Figures (2)

  • Figure 1: Tensor network diagram of Phase 3 of the offline stage for the case $d=3$ and $d_\theta=2$, where each enclosing dashed rectangle represents the resulting matrix as a tensor network node, obtained by applying the face-splitting/Khatri-Rao product of the proper source/target factor matrix to the consequent matrix obtained after performing the contractions enclosed within the dashed rectangle.
  • Figure 2: Tensor network diagram of the online stage for fixed dimensions $d = 3$ and $d_\theta = 2$ with a particular parameter $\boldsymbol{\theta} = (\theta_1, \theta_2)$, obtained after performing the operations illustrated in Figure 1.

Theorems & Definitions (6)

  • Lemma 1
  • Proof 1
  • Proposition 1
  • Proof 2: Proof of Proposition 1
  • Proposition 2
  • Proof 3