When Locally Linear Embedding Hits Boundary

Hau-tieng Wu, Nan Wu

Abstract

Based on the Riemannian manifold model, we study the asymptotic behavior of a widely applied unsupervised learning algorithm, locally linear embedding (LLE), when the point cloud is sampled from a compact, smooth manifold with boundary. We show several peculiar behaviors of LLE near the boundary that differ from those of diffusion-based algorithms. In particular, we show that LLE converges pointwise to a mixed-type differential operator with degeneracy, and we calculate the convergence rate. We discuss the impact of the hyperbolic part of the operator and propose a clipped LLE algorithm, which is a potential approach to recovering the Dirichlet Laplace-Beltrami operator.

Paper Structure

This paper contains 29 sections, 29 theorems, 265 equations, 11 figures.

Key Result

Proposition 2.1

The LLE matrix $W\in \mathbb{R}^{n\times n}$ satisfies $\rho(W) \geq 1$.
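One way to see why such a bound is plausible: in the standard LLE construction the barycentric weights in each row are normalized to sum to one, so $W\mathbf{1}=\mathbf{1}$ and $1$ is an eigenvalue of $W$. The following is a minimal numerical sketch of that observation, not the paper's implementation. The function name `lle_matrix`, the $k$-nearest-neighbor neighborhoods, the trace-scaled regularizer `reg`, and the sample size are all assumptions chosen for illustration; the point cloud is drawn uniformly from the unit disk, as in the dataset described for Figure 4.

```python
import numpy as np


def lle_matrix(X, k=10, reg=1e-3):
    """Hypothetical helper: dense LLE weight matrix from k nearest neighbors.

    Each row holds regularized barycentric weights that sum to one.
    The trace-scaled regularizer is an illustrative assumption.
    """
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(D, np.inf)  # exclude each point from its own neighborhood
    W = np.zeros((n, n))
    ones = np.ones(k)
    for i in range(n):
        idx = np.argsort(D[i])[:k]        # indices of the k nearest neighbors
        Z = X[idx] - X[i]                 # neighbors centered at x_i
        G = Z @ Z.T                       # local Gram matrix (k x k)
        scale = max(np.trace(G), 1e-12)   # guard against a degenerate patch
        G = G + reg * scale * np.eye(k)   # regularize so G is invertible
        w = np.linalg.solve(G, ones)
        W[i, idx] = w / w.sum()           # normalize the row to sum to one
    return W


# Uniform sample from the unit disk, a manifold with boundary.
rng = np.random.default_rng(0)
n = 200
radius = np.sqrt(rng.uniform(size=n))
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
X = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])

W = lle_matrix(X, k=10)
row_sums = W.sum(axis=1)
rho = np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius of W

print("max |row sum - 1|:", np.max(np.abs(row_sums - 1.0)))
print("spectral radius:", rho)
print("fraction of negative weights:", np.mean(W[W != 0] < 0))
```

Because the weights are not constrained to be nonnegative, $W$ is generally not a Markov transition matrix, so the row-sum argument gives only the lower bound on $\rho(W)$ and no matching upper bound.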

Figures (11)

  • Figure 1: The distribution of eigenvalues of the LLE matrix, where $W$ is constructed with $50$ nearest neighbors. In this example, the top eigenvalue is $1$.
  • Figure 2: The $\mathbf{T}$ vector field. The sampled point cloud is plotted in gray. Left: the black points are those satisfying $0.98\leq x^2+y^2\leq 1$, and $\mathbf{T}$ at those points is marked in red. Right: $\mathbf{T}$ at points with $x^2+y^2< 0.98$ is marked in red.
  • Figure 3: The kernel function associated with LLE. The sampled point cloud is plotted in gray. Left: the kernel function $K_\epsilon$, where $\epsilon = 0.1$, at two points, one close to the boundary (indicated by the red circle, with a zoomed-in, enhanced visualization) and one away from the boundary. The kernel close to the boundary changes sign, while the kernel away from the boundary is positive. Right: the expectation $\mathbb{E}K_\epsilon(x,X)$. The expectation of the kernel is positive at all points.
  • Figure 4: Top left subfigure: the original dataset, uniformly sampled from the unit disk, with the radius superimposed as the color. Top middle (right, respectively) subfigure: the embedding by the top two nontrivial eigenvectors of $W-I$ ($(W-I)^\top(W-I)$, respectively), where the original radius of each point is superimposed as the color. The top three nontrivial eigenvectors of $W-I$ are shown in the middle panel, and the top three nontrivial eigenvectors of $(W-I)^\top(W-I)$ are shown in the bottom panel.
  • Figure 5: Top: The first $5$ eigenfunctions of the LLE matrix for a point cloud sampled from the $[0,1]$ interval are plotted with different colors. The dashed vertical gray lines indicate $\epsilon$ and $1-\epsilon$. It is clear that the third, fourth, and fifth eigenfunctions "look like" the eigenfunctions of the Laplace-Beltrami operator with the Dirichlet boundary condition, but higher eigenfunctions become "irregular" when getting closer to the boundary. Middle: the first $10$ eigenfunctions of the clipped LLE matrix $W_r$ for a point cloud sampled from $M_1=[0,1]$ are plotted with different colors. The dashed vertical gray lines indicate $\epsilon$ and $1-\epsilon$. Bottom: the first $10$ eigenfunctions of the clipped LLE matrix $W_r$ for a point cloud sampled from the curve $M_3\subset \mathbb{R}^3$ are plotted with different colors. The dashed vertical gray lines indicate $\epsilon$ and $1-\epsilon$. Compared with those eigenfunctions of $M_1$, the amplitude of higher eigenfunctions of $M_3$ becomes less constant, which is expected due to the nonuniform sampling effect. It is clear that these eigenfunctions "look like" the eigenfunctions of the Laplace-Beltrami operator with the Dirichlet boundary condition without the "irregularity" behavior close to the boundary observed in the top panel.
  • ...and 6 more figures

Theorems & Definitions (48)

  • Proposition 2.1
  • Proposition 2.2
  • proof
  • Remark 3.1
  • Remark 3.2
  • Definition 3.1
  • Definition 3.2
  • Definition 4.1
  • Proposition 4.1
  • Remark 4.1
  • ...and 38 more