A Bound on the Maximal Marginal Degrees of Freedom

Paul Dommel

TL;DR

Kernel ridge regression is memory- and compute-intensive for large datasets. The paper introduces a weight construction based on Legendre polynomials to bound the continuous maximal marginal degrees of freedom $\mathcal{N}_{\infty}(\lambda)$ for radial kernels, yielding explicit rates. Under polynomial decay of the kernel-series remainder $\mathcal{E}_{\phi}$, the bound is $\mathcal{N}_{\infty}(\lambda) \le C_s \lambda^{-\frac{2d}{s-d-\frac{1}{2}}}$; under exponential decay, it is $\mathcal{N}_{\infty}(\lambda) \le C_{\rho} \ln(\lambda^{-1})^{2d}$. It follows that the Nyström method needs a number of centers $m$ only on the order of these bounds, so its cost is almost linear in the sample size $n$ for smooth kernels. The results also broaden the feasible choices of the regularization parameter beyond $\mathcal{O}(n^{-1})$ and connect the discrete Nyström analysis with the continuous operator framework. This gives a theoretical justification for Nyström-based kernel surrogates and points to scalable kernel learning under weaker constraints on the regularization.

Abstract

Kernel ridge regression, in general, is expensive in memory allocation and computation time. This paper addresses low rank approximations and surrogates for kernel ridge regression, which bridge these difficulties. The fundamental contribution of the paper is a lower bound on the minimal rank such that the prediction power of the approximation remains reliable. Based on this bound, we demonstrate that the computational cost of the most popular low rank approach, which is the Nyström method, is almost linear in the sample size. This justifies the method from a theoretical point of view. Moreover, the paper provides a significant extension of the feasible choices of the regularization parameter. The result builds on a thorough theoretical analysis of the approximation of elementary kernel functions by elements in the range of the associated integral operator. We provide estimates of the approximation error and characterize the behavior of the norm of the underlying weight function.
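
To make the Nyström surrogate mentioned in the abstract concrete, the following is a minimal sketch of Nyström kernel ridge regression with $m$ centers. It is not taken from the paper: the Gaussian kernel, the uniform sampling of centers, and the names `gaussian_kernel`, `nystrom_krr_fit`, and `nystrom_krr_predict` are assumptions made for illustration. The point is only that the dominant cost is $O(nm^{2} + m^{3})$, which stays close to linear in $n$ when $m$ grows slowly.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=0.5):
    """Gaussian (radial) kernel matrix between the rows of A and B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def nystrom_krr_fit(X, y, m, lam, rng, sigma=0.5):
    """Fit Nystrom KRR with m centers drawn uniformly without replacement.

    Assumption: uniform sampling and a Gaussian kernel; both are
    illustrative choices, not prescribed by the paper.
    """
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    centers = X[idx]
    K_nm = gaussian_kernel(X, centers, sigma)        # n x m
    K_mm = gaussian_kernel(centers, centers, sigma)  # m x m
    # Normal equations of the restricted problem:
    # (K_nm^T K_nm + lam * n * K_mm) alpha = K_nm^T y
    A = K_nm.T @ K_nm + lam * n * K_mm               # O(n m^2)
    b = K_nm.T @ y
    # lstsq tolerates the ill-conditioning typical of smooth kernels.
    alpha = np.linalg.lstsq(A, b, rcond=None)[0]     # O(m^3)
    return centers, alpha

def nystrom_krr_predict(X_new, centers, alpha, sigma=0.5):
    """Evaluate the fitted surrogate at new points."""
    return gaussian_kernel(X_new, centers, sigma) @ alpha
```

Only the $n \times m$ and $m \times m$ kernel blocks are ever formed, so memory is $O(nm)$ rather than the $O(n^{2})$ required by the full kernel matrix.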

Paper Structure

This paper contains 11 sections, 13 theorems, and 91 equations.

Key Result

Theorem 3.2

Let $\mathcal{X}=[0,1]^{d}$ be equipped with a design measure $P$ such that its density $p$ satisfies [...]. Further, let $k$ be a kernel of the form (eq:KernelShape). Then, provided $\lambda$ is sufficiently small, the maximal marginal degrees of freedom fulfill the bound $\mathcal{N}_{\infty}(\lambda) \le C_{s} \lambda^{-\frac{2d}{s-d-\frac{1}{2}}}$ if Assumption (asuA) is satisfied. Further, it holds that $\mathcal{N}_{\infty}(\lambda) \le C_{\rho} \ln(\lambda^{-1})^{2d}$ if Assumption (asuB) is fulfilled. $C_{s}$ and $C_{\rho}$ are constants depen...
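
As a rough illustration of how differently the two rates from the TL;DR grow, the sketch below evaluates $C_s \lambda^{-\frac{2d}{s-d-\frac{1}{2}}}$ and $C_{\rho} \ln(\lambda^{-1})^{2d}$ numerically. The constants are not available here, so `C_s = C_rho = 1.0` are placeholders; the schedule $\lambda = n^{-1/2}$, the requirement $s > d + \frac{1}{2}$, and the function names are likewise assumptions made only for this example.

```python
import math

def polynomial_decay_bound(lam, d, s, C_s=1.0):
    """C_s * lam^(-2d / (s - d - 1/2)). C_s = 1.0 is a placeholder constant."""
    gap = s - d - 0.5
    if gap <= 0:
        # Assumption of this sketch: the exponent is only meaningful for s > d + 1/2.
        raise ValueError("requires s > d + 1/2")
    return C_s * lam ** (-2.0 * d / gap)

def exponential_decay_bound(lam, d, C_rho=1.0):
    """C_rho * ln(1/lam)^(2d). C_rho = 1.0 is a placeholder constant."""
    if not 0.0 < lam < 1.0:
        raise ValueError("requires 0 < lam < 1")
    return C_rho * math.log(1.0 / lam) ** (2 * d)

# Hypothetical regularization schedule lam = n^(-1/2), with d = 1 and s = 5.5.
for n in (10**3, 10**5, 10**7):
    lam = n ** -0.5
    poly = polynomial_decay_bound(lam, d=1, s=5.5)
    loga = exponential_decay_bound(lam, d=1)
    print(f"n={n:>8}  polynomial={poly:10.1f}  logarithmic={loga:10.1f}")
```

With unit constants the printed values only indicate growth rates, not actual numbers of Nyström centers.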

Theorems & Definitions (31)

  • Remark 3.1
  • Theorem 3.2: Bound on $\mathcal{N}_{\infty}(\lambda)$
  • Remark 3.3
  • Remark 3.4
  • Proposition 3.5: Nyström method
  • Proposition 3.6
  • Remark 3.7
  • Proposition 5.1: Properties $Q_{k}$
  • proof
  • Proposition 5.2
  • ...and 21 more