Feature maps for the Laplacian kernel and its generalizations

Sudhendu Ahir; Parthe Pandit

TL;DR

This work addresses efficient approximation of the Laplacian kernel, which is not separable, and of two generalizations: the Matérn and Exponential-power kernels. It develops two scalable random-feature families, RFF and ORF, that accommodate an anisotropic covariance $M$ and heavy-tailed weight distributions. For each of the three kernels it derives explicit, implementable weight-sampling schemes and proves that the associated random-feature maps converge to the exact kernel as the feature count $p$ grows, including in the anisotropic case. The weight constructions are based on the Fourier transform of each kernel, and the accompanying sampling algorithms draw from elliptically contoured $\alpha$-stable, multivariate $t$, and Cauchy distributions. Numerical validation on real datasets shows speedups and improved calibration for kernel logistic regression. Together, these results give a practical route to scalable kernel-based learning with non-separable kernels that retains theoretical guarantees.

Abstract

Recent applications of kernel methods in machine learning have seen renewed interest in the Laplacian kernel, owing to its stability with respect to the bandwidth hyperparameter in comparison to the Gaussian kernel, and to its expressivity being equivalent to that of the neural tangent kernel of deep fully connected networks. However, unlike the Gaussian kernel, the Laplacian kernel is not separable. This poses challenges for techniques that approximate it, especially the random Fourier features (RFF) methodology and its variants. In this work, we provide random features for the Laplacian kernel and two of its generalizations: the Matérn kernel and the Exponential-power kernel. We provide efficiently implementable schemes to sample weight matrices so that random features approximate these kernels. These weight matrices have weakly coupled, heavy-tailed randomness. Via numerical experiments on real datasets, we demonstrate the efficacy of these random feature maps.
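
To make the idea of heavy-tailed, weakly coupled weights concrete, here is a minimal sketch of random Fourier features for the isotropic Laplacian kernel $\exp(-\|x-y\|_2/\sigma)$. It relies on a standard fact rather than on the paper's own algorithms: the spectral distribution of this kernel is a multivariate Cauchy, which can be sampled as a Gaussian vector divided by the absolute value of an independent scalar Gaussian (equivalently, a multivariate $t$ with one degree of freedom). The function names, the `bandwidth` parameter, and the `nu` argument are illustrative assumptions. Setting `nu` to a value other than 0.5 assumes the common Matérn parametrization whose spectral distribution is a multivariate $t$ with $2\nu$ degrees of freedom, and only the Laplacian case is checked here.

```python
import numpy as np

def sample_weights(p, d, nu=0.5, bandwidth=1.0, rng=None):
    """Draw p frequency vectors in R^d from a multivariate t with 2*nu d.o.f.

    nu = 0.5 gives a multivariate Cauchy, the spectral distribution of the
    Laplacian kernel exp(-||x - y||_2 / bandwidth). Other values of nu are an
    assumption about the Matern parametrization (hypothetical helper).
    """
    rng = np.random.default_rng(rng)
    g = rng.standard_normal((p, d))            # Gaussian directions
    u = rng.chisquare(2.0 * nu, size=(p, 1))   # one shared scalar per row
    # All coordinates of a row share the same scalar, so they are coupled
    # and heavy-tailed, unlike the independent Gaussian weights of the
    # Gaussian-kernel RFF.
    return g / np.sqrt(u / (2.0 * nu)) / bandwidth

def random_features(X, W):
    """Map X (n, d) to cos/sin features (n, 2p) so that Phi @ Phi.T ~ K."""
    Z = X @ W.T
    return np.hstack([np.cos(Z), np.sin(Z)]) / np.sqrt(W.shape[0])

def laplacian_kernel(X, Y, bandwidth=1.0):
    """Exact Laplacian kernel matrix with the Euclidean norm."""
    diff = X[:, None, :] - Y[None, :, :]
    return np.exp(-np.linalg.norm(diff, axis=-1) / bandwidth)

# Compare the random-feature Gram matrix with the exact kernel.
data_rng = np.random.default_rng(0)
X = data_rng.standard_normal((10, 5))
W = sample_weights(p=20000, d=5, nu=0.5, bandwidth=2.0, rng=1)
Phi = random_features(X, W)
K_approx = Phi @ Phi.T
K_exact = laplacian_kernel(X, X, bandwidth=2.0)
max_err = np.abs(K_approx - K_exact).max()
print(f"max abs error with p=20000 features: {max_err:.4f}")
```

Because each feature is bounded, the entrywise error of `K_approx` shrinks at roughly a $1/\sqrt{p}$ rate even though the weights themselves have no finite variance. The anisotropic and Exponential-power cases described in the abstract are not covered by this sketch.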
