A spectral mixture representation of isotropic kernels to generalize random Fourier features

Nicolas Langrené, Xavier Warin, Pierre Gruet

TL;DR

This work extends the Random Fourier Features (RFF) framework to a broad class of positive definite isotropic shift-invariant kernels by proving that their spectral distributions are scale mixtures of symmetric stable vectors. The key insight is that for any $K\in\Phi_\infty$, the associated kernel $K_\alpha(\boldsymbol{u})=k(\|\boldsymbol{u}\|^\alpha)$ admits a random projection $\boldsymbol{\eta}_\alpha=R^{1/\alpha}\boldsymbol{S}_\alpha$, where $\boldsymbol{S}_\alpha$ is itself a Gaussian mixture, $\boldsymbol{S}_\alpha=\sqrt{2A_\alpha}\,\boldsymbol{N}$, with an explicit form for $A_\alpha$. Sampling therefore only requires the univariate variables $R$ and $A_\alpha$ and a Gaussian vector $\boldsymbol{N}$. This framework unifies and generalizes kernel spectra to include exponential power, Matérn, generalized Cauchy, and newly introduced Beta, Kummer, and Tricomi kernels, while highlighting the efficiency advantage of isotropic over tensor constructions. Because the kernel is determined by the mixing distribution of $R$, new kernels can be designed or learned by changing that distribution. Numerical experiments validate the exactness and convergence of the sampling formulas and illustrate the applicability of RFF across diverse kernels.
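As a minimal sketch of this sampling recipe, the Python snippet below draws $\boldsymbol{\eta}_\alpha=R^{1/\alpha}\sqrt{2A_\alpha}\,\boldsymbol{N}$ and uses it to estimate a kernel value. It assumes a normalization that this summary does not state: $\boldsymbol{S}_\alpha$ has characteristic function $\exp(-\|\boldsymbol{u}\|^\alpha)$, so that $A_\alpha$ can be taken as a positive $(\alpha/2)$-stable variable with Laplace transform $\exp(-s^{\alpha/2})$, simulated here with Kanter's representation. Under that assumption, $\mathbb{E}[\cos(\boldsymbol{\eta}_\alpha\cdot\boldsymbol{u})]=\mathcal{L}(\|\boldsymbol{u}\|^\alpha)$, where $\mathcal{L}$ is the Laplace transform of $R$. All function names are illustrative, and the paper's own parametrization may differ.

```python
import numpy as np

def sample_positive_stable(beta, size, rng):
    """Draw A >= 0 with E[exp(-s*A)] = exp(-s**beta), for 0 < beta <= 1.

    Uses Kanter's representation; beta = 1 is the degenerate case A = 1.
    """
    if beta == 1.0:
        return np.ones(size)
    U = rng.uniform(0.0, np.pi, size)
    E = rng.exponential(1.0, size)
    return (np.sin(beta * U) / np.sin(U) ** (1.0 / beta)
            * (np.sin((1.0 - beta) * U) / E) ** ((1.0 - beta) / beta))

def sample_symmetric_stable(alpha, d, size, rng):
    """S_alpha = sqrt(2*A_alpha) * N, assumed normalized so that
    E[exp(i u.S_alpha)] = exp(-||u||**alpha)."""
    A = sample_positive_stable(alpha / 2.0, size, rng)
    N = rng.standard_normal((size, d))
    return np.sqrt(2.0 * A)[:, None] * N

def spectral_samples(alpha, d, size, rng, sample_R=None):
    """eta_alpha = R**(1/alpha) * S_alpha.

    sample_R(size, rng) is a hypothetical user-supplied sampler for R;
    None means R = 1, i.e. the exponential power kernel exp(-||u||**alpha).
    """
    S = sample_symmetric_stable(alpha, d, size, rng)
    if sample_R is None:
        return S
    R = sample_R(size, rng)
    return R[:, None] ** (1.0 / alpha) * S

def rff_kernel_value(u, eta):
    """Monte Carlo estimate of K(u) = E[cos(eta . u)]."""
    return float(np.mean(np.cos(eta @ u)))

rng = np.random.default_rng(0)
u = np.array([0.5, 0.3])
r = np.linalg.norm(u)

# R = 1: exponential power kernel with alpha = 1.5.
eta = spectral_samples(1.5, 2, 40000, rng)
print(rff_kernel_value(u, eta), np.exp(-r ** 1.5))

# R ~ Gamma(1.5, 1): Laplace transform (1 + s)**(-1.5), so with alpha = 1
# the estimated kernel is (1 + ||u||)**(-1.5).
gamma_R = lambda size, rng: rng.gamma(1.5, 1.0, size)
eta = spectral_samples(1.0, 2, 40000, rng, sample_R=gamma_R)
print(rff_kernel_value(u, eta), (1.0 + r) ** (-1.5))
```

In this sketch, changing only the sampler for $R$ changes the kernel, which is the design lever the summary describes; the stable and Gaussian parts are untouched.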

Abstract

Rahimi and Recht (2007) introduced the idea of decomposing positive definite shift-invariant kernels by randomly sampling from their spectral distribution. This famous technique, known as Random Fourier Features (RFF), is in principle applicable to any such kernel whose spectral distribution can be identified and simulated. In practice, however, it is usually applied to the Gaussian kernel because of its simplicity, since its spectral distribution is also Gaussian. Clearly, simple spectral sampling formulas would be desirable for broader classes of kernels. In this paper, we show that the spectral distribution of positive definite isotropic kernels in $\mathbb{R}^{d}$ for all $d\geq1$ can be decomposed as a scale mixture of $\alpha$-stable random vectors, and we identify the mixing distribution as a function of the kernel. This constructive decomposition provides a simple and ready-to-use spectral sampling formula for many multivariate positive definite shift-invariant kernels, including exponential power kernels, generalized Matérn kernels, generalized Cauchy kernels, as well as newly introduced kernels such as the Beta, Kummer, and Tricomi kernels. In particular, we retrieve the fact that the spectral distributions of these kernels are scale mixtures of the multivariate Gaussian distribution, along with an explicit mixing distribution formula. This result has broad applications for support vector machines, kernel ridge regression, Gaussian processes, and other kernel-based machine learning techniques for which the random Fourier features technique is applicable.
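For readers new to RFF, the baseline Gaussian case mentioned above can be sketched in a few lines of Python. The snippet assumes the kernel is parametrized as $\exp(-\|\boldsymbol{x}-\boldsymbol{y}\|^2/(2\ell^2))$, whose spectral distribution is $\mathcal{N}(0, I/\ell^2)$; the function names and the cosine/sine feature layout are illustrative choices, not taken from the paper.

```python
import numpy as np

def sample_gaussian_spectrum(num_features, d, lengthscale, rng):
    """Frequencies omega ~ N(0, I / lengthscale**2), shape (num_features, d)."""
    return rng.standard_normal((num_features, d)) / lengthscale

def rff_features(X, omega):
    """Map X (n, d) to features Z (n, 2M) such that Z @ Z.T approximates K.

    Z(x).Z(y) = (1/M) * sum_m cos(omega_m . (x - y)), an unbiased estimate
    of the kernel by Bochner's theorem.
    """
    proj = X @ omega.T
    M = omega.shape[0]
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(M)

def gaussian_kernel_matrix(X, lengthscale=1.0):
    """Exact Gaussian kernel matrix, used only as a reference."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * lengthscale ** 2))

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))
omega = sample_gaussian_spectrum(4000, 2, 1.0, rng)
Z = rff_features(X, omega)
max_error = np.max(np.abs(Z @ Z.T - gaussian_kernel_matrix(X)))
print(f"max abs error with M=4000: {max_error:.4f}")
```

Only `sample_gaussian_spectrum` is specific to the Gaussian kernel. Replacing it with a sampler for another kernel's spectral distribution leaves the feature map unchanged.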

Paper Structure

This paper contains 5 sections, 6 theorems, 15 equations, 16 figures, 1 table.

Key Result

Lemma 1

For any $\alpha\in(0,2]$, let $\boldsymbol{S}_{\alpha}$ be a $d$-dimensional symmetric stable random vector (Definition 1), let $R$ be a real-valued nonnegative random variable, independent of $\boldsymbol{S}_{\alpha}$, with Laplace transform $\mathcal{L}$, and let $\lambda>0$ be … spans the following isotropic kernel $K:\mathbb{R}^{d}\rightarrow\mathbb{R}$:

Figures (16)

  • Figure 1: Univariate Gaussian kernel and its random Fourier features approximation using $M=1000$ random projections with Gaussian distribution.
  • Figure 2: Bivariate Gaussian kernel (right) and its random Fourier features approximation (left) using $M=4000$ random projections.
  • Figure 3: Univariate Matérn-$3/2$ kernel and its random Fourier features approximation using $M=1000$ random projections.
  • Figure 4: Bivariate Matérn-$3/2$ kernel (right) and its random Fourier features approximation (left) using $M=4000$ random projections.
  • Figure 5: Univariate Laplace (a.k.a. Matérn-$1/2$) kernel and its random Fourier features approximation using $M=1000$ random projections.
  • ...and 11 more figures
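
A comparison in the spirit of Figures 3 to 5 can be sketched with the classical result that the Matérn-$\nu$ spectral distribution is a multivariate Student-$t$ distribution with $2\nu$ degrees of freedom, which is a Gaussian scale mixture. The Python snippet below assumes the common parametrization in which Matérn-$3/2$ is $(1+\sqrt{3}r/\ell)\exp(-\sqrt{3}r/\ell)$ and Matérn-$1/2$ is $\exp(-r/\ell)$. The function names are illustrative, and the paper's own mixing formula may be written differently.

```python
import numpy as np

def matern_spectral_samples(nu, d, size, rng, lengthscale=1.0):
    """Frequencies for the Matern-nu kernel as a Gaussian scale mixture.

    omega = sqrt(nu / G) * N / lengthscale with G ~ Gamma(nu, 1), which is
    a multivariate Student-t vector with 2*nu degrees of freedom.
    """
    G = rng.gamma(shape=nu, scale=1.0, size=size)
    N = rng.standard_normal((size, d))
    return np.sqrt(nu / G)[:, None] * N / lengthscale

def rff_estimate(u, omega):
    """Monte Carlo estimate of K(u) = E[cos(omega . u)]."""
    return float(np.mean(np.cos(omega @ u)))

def matern32(r, lengthscale=1.0):
    """Exact Matern-3/2 kernel as a function of the distance r."""
    s = np.sqrt(3.0) * r / lengthscale
    return (1.0 + s) * np.exp(-s)

def laplace(r, lengthscale=1.0):
    """Exact Laplace (Matern-1/2) kernel as a function of the distance r."""
    return np.exp(-r / lengthscale)

rng = np.random.default_rng(0)
omega32 = matern_spectral_samples(1.5, 1, 1000, rng)
omega12 = matern_spectral_samples(0.5, 1, 1000, rng)
for r in (0.0, 0.5, 1.0, 2.0):
    u = np.array([r])
    print(r, rff_estimate(u, omega32), matern32(r),
          rff_estimate(u, omega12), laplace(r))
```

With $M=1000$ projections, as in the univariate figures, the Monte Carlo error is of order $1/\sqrt{M}$, so agreement to about two decimal places is what this sketch should be expected to show.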

Theorems & Definitions (18)

  • Definition 1
  • Lemma 1
  • proof
  • Theorem 1
  • proof
  • Corollary 1
  • Corollary 2
  • proof
  • Remark 1
  • Remark 2
  • ...and 8 more