Manifold Diffusion Geometry: Curvature, Tangent Spaces, and Dimension

Iolo Jones

TL;DR

This work addresses robust geometric inference from finite, noisy samples that lie near manifolds. It develops diffusion-geometry-based estimators for the Laplacian, carré du champ, tangent spaces, pointwise and global dimension, and curvature tensors (Riemann, Ricci, scalar), using diffusion maps with variable bandwidth kernels. The resulting estimators are robust to noise, and the tangent space and scalar curvature estimates need no parameter selection. They perform comparably to the state of the art on clean data but significantly outperform competing methods under noise or sparsity, with improved dimension estimates and feasible curvature estimation. By grounding geometric quantities in diffusion geometry and the second fundamental form, the method supports reliable geometric machine learning on real-world, imperfect data and suggests directions for extending these tools to non-manifold data and higher-dimensional settings.
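
For readers unfamiliar with the carré du champ named above, its standard definition is recalled here as a reminder. The sign convention is an assumption, since this summary does not state one: $\Delta$ is taken to be the divergence of the gradient, so that $\Gamma(f,f) \geq 0$.

$$\Gamma(f,h) = \tfrac{1}{2}\bigl(\Delta(fh) - f\,\Delta h - h\,\Delta f\bigr), \qquad \Gamma(f,h) = g(\nabla f, \nabla h) \quad \text{on a Riemannian manifold } (\mathcal{M}, g).$$

The second identity is why an estimate of $\Delta$ alone can give access to inner products of gradients, and through them to tangent spaces and curvature.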

Abstract

We introduce novel estimators for computing the curvature, tangent spaces, and dimension of data from manifolds, using tools from diffusion geometry. Although classical Riemannian geometry is a rich source of inspiration for geometric data analysis and machine learning, it has historically been hard to implement these methods in a way that performs well statistically. Diffusion geometry lets us develop Riemannian geometry methods that are accurate and, crucially, also extremely robust to noise and low-density data. The methods we introduce here are comparable to the existing state-of-the-art on ideal dense, noise-free data, but significantly outperform them in the presence of noise or sparsity. In particular, our dimension estimate improves on the existing methods on a challenging benchmark test when even a small amount of noise is added. Our tangent space and scalar curvature estimates do not require parameter selection and substantially improve on existing techniques.

Paper Structure

This paper contains 25 sections, 2 theorems, 32 equations, 7 figures, 2 tables.

Key Result

Theorem 3.1

Let $q \in L^1(\mathcal{M}) \cap C^3(\mathcal{M})$ be a density that is bounded above on $\mathcal{M}$ and let $X$ be sampled independently with distribution $q$. If $f \in L^2(\mathcal{M},q) \cap C^3(\mathcal{M})$ is a smooth function, $\hat{f}$ is the vector $\hat{f}_i = f(p_i)$, and $p_i \in X$, then […] up to rescaling $\hat{\Delta}_\epsilon$ by a constant positive factor $c$.
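
To give a feel for the kind of statement Theorem 3.1 makes, the sketch below builds a kernel-based Laplacian estimate on evenly spaced points of the unit circle and applies it to $f(\theta) = \cos\theta$, whose Laplacian is $-\cos\theta$. This is not the variable-bandwidth estimator of the theorem: it assumes the simpler fixed-bandwidth diffusion-maps construction with density normalisation. The function names, the bandwidth `eps = 0.01`, and the sample size are all hypothetical choices made for illustration.

```python
import math

def circle_points(n):
    """n evenly spaced points on the unit circle, together with their angles."""
    thetas = [2.0 * math.pi * i / n for i in range(n)]
    return thetas, [(math.cos(t), math.sin(t)) for t in thetas]

def kernel_laplacian(points, eps):
    """Dense matrix (P - I) / eps from a fixed-bandwidth Gaussian kernel.

    Assumption: classical diffusion-maps normalisation, used here as a
    stand-in for the variable-bandwidth estimator of Theorem 3.1.
    """
    n = len(points)
    # Gaussian kernel on squared Euclidean (chordal) distances.
    K = [[math.exp(-((px - qx) ** 2 + (py - qy) ** 2) / (4.0 * eps))
          for (qx, qy) in points] for (px, py) in points]
    # Divide out a kernel density estimate to correct for uneven sampling.
    q = [sum(row) for row in K]
    K = [[K[i][j] / (q[i] * q[j]) for j in range(n)] for i in range(n)]
    # Row-normalise to a Markov matrix P, then form (P - I) / eps.
    L = []
    for i in range(n):
        d = sum(K[i])
        L.append([(K[i][j] / d - (1.0 if i == j else 0.0)) / eps
                  for j in range(n)])
    return L

def apply_matrix(L, f_hat):
    """Matrix-vector product L @ f_hat on plain Python lists."""
    return [sum(l_ij * f_j for l_ij, f_j in zip(row, f_hat)) for row in L]

thetas, X = circle_points(200)
f_hat = [math.cos(t) for t in thetas]          # the vector f_hat_i = f(p_i)
lap_f = apply_matrix(kernel_laplacian(X, eps=0.01), f_hat)

# Compare against the exact Laplacian of cos(theta) on the unit circle.
max_err = max(abs(l + math.cos(t)) for l, t in zip(lap_f, thetas))
print(f"max |estimate - exact| = {max_err:.4f}")
```

The factor 4 in the kernel exponent is chosen so that no extra rescaling constant is needed in this toy setting; with a different kernel scaling the estimate would agree with the Laplacian only up to a constant positive factor, which is the role of $c$ in the theorem.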

Figures (7)

  • Figure 1: The pointwise dimensions, tangent spaces, and scalar curvature of a manifold.
  • Figure 2: Pointwise diffusion dimension. The dimension is estimated at each point for data in 2d (top row) and 3d (bottom row). This process generally identifies 1-dimensional boundaries as 1-dimensional, and hard corners as 0-dimensional. These examples are given without noise for clarity, although this process is very robust to noise.
  • Figure 3: Tangent spaces of data. We compute the 1d (top row) and 2d (bottom row) tangent space for each point. When the data are from a manifold, we robustly recover the tangent bundle even with large amounts of noise. When the data are not from a manifold (top centre example), these diffusion tangents measure the direction of greatest heat flow along the object.
  • Figure 4: Diffusion Geometry vs Local PCA. We compute the tangent space for a torus in $\mathbb{R}^3$ with different sampling densities and noise levels, using tangent diffusion and LPCA. We measure accuracy at a point by finding the error angle between the normal vector to the computed tangent space and the true normal at that point. The grids contain the average error angle in degrees, averaged over 10 runs: 0° means perfect accuracy over the whole torus and over 45° means the tangent spaces are random. LPCA is computed for $k=5$ and $k=100$ nearest neighbours: both values perform well in places but no fixed value of $k$ is always good. Diffusion geometry is comparable to or better than LPCA and does not need parameter selection.
  • Figure 5: Scalar curvature of surfaces. We compute the scalar curvature with diffusion geometry. When the surface is locally spherical, the curvature is positive (e.g. on most of the dented sphere). When the surface is locally hyperbolic, the curvature is negative (e.g. the hyperboloid, which is negatively curved everywhere, and the rim of the dent on the sphere). Zero curvature means the surface looks locally like Euclidean space, such as on the Swiss roll, which is a wrapped but, crucially, not deformed rectangle.
  • ...and 2 more figures
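
The Local PCA baseline and the error-angle measure described in the Figure 4 caption can be sketched in a simplified setting. The example below is a 2d analogue, assuming points on the unit circle rather than a torus in $\mathbb{R}^3$; in 2d the angle between tangent lines equals the angle between normals, so the measure matches the one in the caption. All function names, the noise level, and the choice $k=6$ are hypothetical, and the diffusion-geometry estimator itself is not reproduced here.

```python
import math
import random

def noisy_circle(n, noise, seed=0):
    """n points on the unit circle with isotropic Gaussian noise added."""
    rng = random.Random(seed)
    thetas = [2.0 * math.pi * i / n for i in range(n)]
    pts = [(math.cos(t) + rng.gauss(0.0, noise),
            math.sin(t) + rng.gauss(0.0, noise)) for t in thetas]
    return thetas, pts

def lpca_tangent(points, i, k):
    """Unit tangent at points[i]: top principal axis of it and its k nearest neighbours."""
    px, py = points[i]
    nearest = sorted(points,
                     key=lambda p: (p[0] - px) ** 2 + (p[1] - py) ** 2)[:k + 1]
    mx = sum(p[0] for p in nearest) / len(nearest)
    my = sum(p[1] for p in nearest) / len(nearest)
    # Entries of the (unnormalised) 2x2 covariance matrix [[a, b], [b, c]].
    a = sum((p[0] - mx) ** 2 for p in nearest)
    b = sum((p[0] - mx) * (p[1] - my) for p in nearest)
    c = sum((p[1] - my) ** 2 for p in nearest)
    # Closed form for the major-axis direction of a symmetric 2x2 matrix.
    phi = 0.5 * math.atan2(2.0 * b, a - c)
    return math.cos(phi), math.sin(phi)

def error_angle_deg(tangent, theta):
    """Angle in degrees between the estimated and true tangent lines at angle theta."""
    true_t = (-math.sin(theta), math.cos(theta))
    dot = abs(tangent[0] * true_t[0] + tangent[1] * true_t[1])
    return math.degrees(math.acos(min(1.0, dot)))

def mean_error(n, noise, k):
    """Average error angle over all sample points."""
    thetas, pts = noisy_circle(n, noise)
    return sum(error_angle_deg(lpca_tangent(pts, i, k), t)
               for i, t in enumerate(thetas)) / n

clean_error = mean_error(n=200, noise=0.0, k=6)
noisy_error = mean_error(n=200, noise=0.05, k=6)
print(f"clean: {clean_error:.2f} deg, noisy: {noisy_error:.2f} deg")
```

Rerunning `mean_error` with different values of `k` and `noise` gives a rough, small-scale version of the grids in Figure 4 and shows how the best neighbourhood size depends on the noise level.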

Theorems & Definitions (5)

  • Theorem 3.1 (from Corollary 1 of Berry and Harlim, 2016)
  • Corollary 3.2
  • Proof
  • Remark 3.3
  • Remark 3.4