Improved convergence rate of kNN graph Laplacians

Yixuan Tan, Xiuyuan Cheng

TL;DR

The point-wise convergence of the $k$NN graph Laplacian to the limiting manifold operator (depending on $p$) is proved at the rate of $O(N^{-2/(d+6)}\,)$, up to a log factor, when $k_0$ and $\phi$ have $C^3$ regularity and satisfy other technical conditions.

Abstract

In graph-based data analysis, $k$-nearest neighbor ($k$NN) graphs are widely used due to their adaptivity to local data densities. Allowing weighted edges in the graph, the kernelized graph affinity provides a more general type of $k$NN graph where the $k$NN distance is used to set the kernel bandwidth adaptively. In this work, we consider a general class of $k$NN graph where the graph affinity is $W_{ij} = \epsilon^{-d/2} \, k_0 \left( \frac{\| x_i - x_j \|^2}{\epsilon \, \phi( \widehat{\rho}(x_i), \widehat{\rho}(x_j) )^2} \right)$, with $\widehat{\rho}(x)$ being the (rescaled) $k$NN distance at the point $x$, $\phi$ a symmetric bi-variate function, and $k_0$ a non-negative function on $[0,\infty)$. Under the manifold data setting, where $N$ i.i.d. samples $x_i$ are drawn from a density $p$ on a $d$-dimensional unknown manifold embedded in a high dimensional Euclidean space, we prove the point-wise convergence of the $k$NN graph Laplacian to the limiting manifold operator (depending on $p$) at the rate of $O(N^{-2/(d+6)}\,)$, up to a log factor, when $k_0$ and $\phi$ have $C^3$ regularity and satisfy other technical conditions. This fast rate is obtained when $\epsilon \sim N^{-2/(d+6)}\,$ and $k \sim N^{6/(d+6)}\,$, both at the optimal order to balance the theoretical bias and variance errors. When $k_0$ and $\phi$ have lower regularities, including when $k_0$ is a compactly supported function as in the standard $k$NN graph, the convergence rate degenerates to $O(N^{-1/(d+4)}\,)$. Our improved convergence rate is based on a refined analysis of the $k$NN estimator, which can be of independent interest. We validate our theory by numerical experiments on simulated data.
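
The affinity described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the constants in $\epsilon \sim N^{-2/(d+6)}$ and $k \sim N^{6/(d+6)}$ are set to 1, the rescaling of the $k$NN distance by $(\alpha_d N / k)^{1/d}$ (with $\alpha_d$ the volume of the unit $d$-ball) is assumed, and the choices $k_0(r) = e^{-r}$ and $\phi(u, v) = \sqrt{uv}$ are merely one smooth example.

```python
import numpy as np
from math import gamma, pi


def suggested_parameters(N, d):
    """Orders of epsilon and k from the abstract; constants set to 1 (assumption)."""
    eps = N ** (-2.0 / (d + 6))
    k = int(round(N ** (6.0 / (d + 6))))
    return eps, k


def knn_bandwidth(X, k, d):
    """Rescaled kNN distance at each sample (the rescaling is an assumption)."""
    N = X.shape[0]
    sq = np.sum(X**2, axis=1)
    # Pairwise squared Euclidean distances, clipped at 0 against round-off.
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # Sorted index 0 is the point itself, so index k is its k-th neighbor.
    R = np.sqrt(np.partition(D2, k, axis=1)[:, k])
    alpha_d = pi ** (d / 2) / gamma(d / 2 + 1)  # volume of the unit d-ball
    return R * (alpha_d * N / k) ** (1.0 / d), D2


def knn_affinity(X, eps, k, d,
                 k0=lambda r: np.exp(-r),            # assumed smooth kernel
                 phi=lambda u, v: np.sqrt(u * v)):   # assumed symmetric phi
    """W_ij = eps^(-d/2) * k0(|x_i - x_j|^2 / (eps * phi(rho_i, rho_j)^2))."""
    rho, D2 = knn_bandwidth(X, k, d)
    Phi = phi(rho[:, None], rho[None, :])
    return eps ** (-d / 2) * k0(D2 / (eps * Phi**2))


# Usage: N samples on the unit circle in R^2, a manifold of dimension d = 1.
rng = np.random.default_rng(0)
N, d = 500, 1
t = rng.uniform(0.0, 2.0 * np.pi, size=N)
X = np.column_stack([np.cos(t), np.sin(t)])
eps, k = suggested_parameters(N, d)
W = knn_affinity(X, eps, k, d)
```

With all constants set to 1 the resulting kernel is wide at this sample size, so the snippet shows the construction rather than a tuned estimator. A non-differentiable $\phi$ such as `np.minimum` or `np.maximum` can be passed through the same `phi` argument; the abstract indicates that lower regularity of $k_0$ or $\phi$ corresponds to the slower rate.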

Paper Structure

This paper contains 78 sections, 38 theorems, 583 equations, 8 figures, 1 table.

Key Result

Lemma 2.2

Under Assumptions assump:M and assump:p, suppose $0\leq r \leq r_0$. (i) $\bar{\rho}_r(x)$ is well-defined, i.e., the equation eq:def-bar-rho-epsilon has a unique solution in $t \in (0, \rho_{\max})$. If $p \in C^l(\mathcal{M})$ for some integer $l \geq 3$, then $\bar{\rho}_r \in C^{l-2}(\mathcal{M})$ ...

Figures (8)

  • Figure 1: The empirical $k$NN bandwidth function $\hat{\rho}$ defined in \ref{eq:def-hat-rho} (marked with blue dots), computed from $N = 2000$ samples on a one-dimensional curve with $k = 32$ (Left) and $k = 64$ (Right), compared with the population bandwidth function $\bar{\rho}_{r_k}$ defined in Definition \ref{def:bar-rho-epsilon} (red solid line) and $\bar{\rho} = p^{-1/d}$ (orange dashed line).
  • Figure 2: The errors of different kernels plotted against values of $\sigma_0^2$, where $\text{Err}$ defined in \ref{eq:def-L1-err} (averaged over $2000$ runs) is shown in solid curves, and $\overline{\rm Err}$ defined in \ref{eq:def-L1bar-err} is shown in dashed lines. The fitted slopes on the log-log plots are also shown. (Left) Theoretical fast-rate cases (i) and (ii). (Right) Theoretical slow-rate cases (iii), (iv), and (v).
  • Figure A.1: The simulated data on a closed curve in $\mathbb{R}^4$. (a) The first three coordinates of 2,000 samples, where the color depth represents the density function $p$. (b) The density function $p$. (c) The test function $f$. (d) The function $\mathcal{L}_p f$ defined in \ref{eq:def-Delta-p}. In (b)(c)(d), the functions are plotted against the intrinsic coordinate $t$ of the curve.
  • Figure A.2: Values of $k_0( \|x_0 - x_j \|^2 / ( \sigma_0^2 \phi(\hat{R}(x_0), \hat{R}(x_j))^2 ) )$ plotted against $x_j$ in its intrinsic coordinate on $[0,1]$ for a fixed $x_0$. The local neighborhood around $x_0$ used in the experiments is colored in grey. For each of the five types of affinities, we use the value of $\sigma_0$ that achieves the minimum Err in Figure \ref{fig:converg-rates}.
  • Figure A.3: $R(\alpha, \beta)$, defined in \ref{eq:def-R-alpha-beta-fast}, together with other quantities, plotted against $\alpha$ for a fixed value of $\beta$. The left and right plots correspond to the two cases discussed in the proof of Lemma \ref{lemma:fast-rate}, respectively.
  • ...and 3 more figures
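
The comparison in Figure 1 between the empirical bandwidth $\hat{\rho}$ and $\bar{\rho} = p^{-1/d}$ can be mimicked with a small experiment. The sketch below is not the paper's setup: it assumes a uniform density on the unit circle (so $p = 1/(2\pi)$ with respect to arc length and $d = 1$), and it assumes that $\hat{\rho}$ is the $k$NN distance $\hat{R}$ rescaled by $(\alpha_d N / k)^{1/d}$, where $\alpha_1 = 2$. Only $N = 2000$ and $k = 64$ are taken from the caption.

```python
import numpy as np

rng = np.random.default_rng(0)
N, k, d = 2000, 64, 1  # N and k as in the Figure 1 caption (right panel)

# Assumed data: uniform samples on the unit circle in R^2, p = 1 / (2*pi).
t = rng.uniform(0.0, 2.0 * np.pi, size=N)
X = np.column_stack([np.cos(t), np.sin(t)])

# Pairwise Euclidean distances via the Gram matrix, clipped against round-off.
sq = np.sum(X**2, axis=1)
D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
dist = np.sqrt(D2)

# k-th nearest-neighbor distance; sorted index 0 is the point itself.
R_hat = np.sort(dist, axis=1)[:, k]

alpha_d = 2.0  # volume of the unit ball in R^1
rho_hat = R_hat * (alpha_d * N / k) ** (1.0 / d)   # assumed rescaling
rho_bar = (1.0 / (2.0 * np.pi)) ** (-1.0 / d)      # p^{-1/d} = 2*pi

# Pointwise relative deviation of the empirical bandwidth from p^{-1/d}.
rel_err = np.abs(rho_hat - rho_bar) / rho_bar
```

Plotting `rho_hat` against `t` next to the constant `rho_bar` gives a picture in the spirit of Figure 1; a non-uniform density would additionally expose the difference between $\bar{\rho}_{r_k}$ and $p^{-1/d}$, which this uniform example cannot show.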

Theorems & Definitions (90)

  • Definition 1.1
  • Definition 2.1: Population bandwidth function $\bar{\rho}_r$
  • Lemma 2.2: Construction of $\bar{\rho}_r$
  • Proposition 2.3
  • Theorem 2.4
  • Remark 2.1: The overall error at optimal $k$
  • Remark 2.2: The $(k/N)^{3/d}$ bias error
  • Example 3.1: Differentiable $\phi$
  • Example 3.2: $\max$ or $\min$ $\phi$
  • Remark 3.1: The influence of $\phi$ on the graph Laplacian estimation error
  • ...and 80 more