Parametric Hierarchical Matrix Approximations to Kernel Matrices
Abraham Khan, Chao Chen, Vishwas Rao, Arvind K. Saibaba
TL;DR
This work addresses the high cost of repeatedly forming kernel matrix approximations as hyperparameters vary. It introduces parametric $\mathcal{H}$- and $\mathcal{H}^{2}$-matrices, built on Chebyshev interpolation and tensor-train (TT) compression, including TT-cross approximation. An offline stage precomputes a parametric representation over the parameter space; an online stage then instantiates the kernel matrix approximation for a fixed parameter without any new kernel evaluations. The key contributions are TT-based parametric approximations of the far field (via PTTK) and the near field (via a TT-augmented scheme), cost analyses showing $O(n \log n)$ or $O(n)$ offline behavior and linear-time online matrix-vector multiplications (MVMs), and numerical experiments demonstrating speedups of over $100\times$ across multiple kernels and parameter ranges. The results indicate that the approach yields practical, scalable, and accurate kernel matrix approximations suitable for GP-based learning and inverse problems, with strong potential for translation-invariant kernels and large-scale datasets. Overall, parametric hierarchical matrices offer a framework for efficient parameter-dependent kernel computations in scientific computing and machine learning.
Abstract
Kernel matrices are ubiquitous in computational mathematics, often arising from applications in machine learning and scientific computing. In two or three spatial or feature dimensions, such problems can be approximated efficiently by a class of matrices known as hierarchical matrices. A hierarchical matrix consists of a hierarchy of small near-field blocks (or sub-matrices) stored in a dense format and large far-field blocks approximated by low-rank matrices. Standard methods for forming hierarchical matrices do not account for the fact that kernel matrices depend on specific hyperparameters; for example, in the context of Gaussian processes, hyperparameters must be optimized over a fixed parameter space. We introduce a new class of hierarchical matrices, namely, parametric (parameter-dependent) hierarchical matrices. Members of this new class are parametric $\mathcal{H}$-matrices and parametric $\mathcal{H}^{2}$-matrices. The construction of a parametric hierarchical matrix follows an offline-online paradigm. In the offline stage, the near-field and far-field blocks are approximated by using polynomial approximation and tensor compression. In the online stage, for a particular hyperparameter, the parametric hierarchical matrix is instantiated efficiently as a standard hierarchical matrix. The asymptotic costs for storage and computation in the offline stage are comparable to the corresponding standard approaches of forming a hierarchical matrix. However, the online stage of our approach requires no new kernel evaluations, and the far-field blocks can be computed more efficiently than standard approaches. Numerical experiments show over $100\times$ speedups compared with existing techniques.
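The offline-online idea can be illustrated on a single dense block. The following is a minimal sketch under strong simplifying assumptions, not the authors' method: it interpolates one Gaussian kernel block in the length-scale hyperparameter only, using Chebyshev nodes and Lagrange weights, and it omits the hierarchical partitioning, the spatial interpolation, and the tensor-train compression that the paper describes. All function names (`cheb_nodes`, `lagrange_weights`, `gaussian_kernel`, `offline`, `online`), the choice of kernel, and the parameter range are hypothetical and chosen for illustration.

```python
import numpy as np

def cheb_nodes(a, b, p):
    """Chebyshev nodes of the first kind, mapped from [-1, 1] to [a, b]."""
    k = np.arange(p)
    t = np.cos((2 * k + 1) * np.pi / (2 * p))
    return 0.5 * (a + b) + 0.5 * (b - a) * t

def lagrange_weights(nodes, theta):
    """Values of the Lagrange basis polynomials for `nodes` at `theta`."""
    p = len(nodes)
    w = np.ones(p)
    for j in range(p):
        for m in range(p):
            if m != j:
                w[j] *= (theta - nodes[m]) / (nodes[j] - nodes[m])
    return w

def gaussian_kernel(X, Y, ell):
    """Gaussian kernel block exp(-|x - y|^2 / (2 ell^2)); illustrative choice."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * ell**2))

def offline(X, Y, a, b, p):
    """Offline stage (sketch): evaluate the block at p parameter nodes."""
    nodes = cheb_nodes(a, b, p)
    blocks = np.stack([gaussian_kernel(X, Y, ell) for ell in nodes])
    return nodes, blocks  # blocks has shape (p, n_rows, n_cols)

def online(nodes, blocks, ell):
    """Online stage (sketch): combine stored blocks; no kernel evaluations."""
    w = lagrange_weights(nodes, ell)
    return np.tensordot(w, blocks, axes=1)

# Hypothetical test problem: random points in the unit square,
# length scale restricted to the interval [0.5, 2.0].
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 2))
Y = rng.uniform(0.0, 1.0, size=(40, 2))

nodes, blocks = offline(X, Y, a=0.5, b=2.0, p=20)
K_approx = online(nodes, blocks, ell=1.3)

# Reference block, formed directly only to measure the interpolation error.
K_exact = gaussian_kernel(X, Y, 1.3)
rel_err = np.linalg.norm(K_approx - K_exact) / np.linalg.norm(K_exact)
```

In this sketch the stored array `blocks` grows linearly with the number of parameter nodes, and it would grow exponentially with the number of hyperparameters if a tensor grid were used. That growth is one motivation for a compressed representation such as the tensor-train format mentioned in the abstract.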
