CA-PCA: Manifold Dimension Estimation, Adapted for Curvature

Anna C. Gilbert, Kevin O'Neill

TL;DR

This paper develops CA-PCA, a version of local PCA calibrated on a quadratic embedding rather than a flat unit ball, thereby acknowledging the curvature of the underlying manifold, and shows that this adaptation improves the estimator in a wide range of settings.

Abstract

The success of algorithms in the analysis of high-dimensional data is often attributed to the manifold hypothesis, which supposes that such data lie on or near a manifold of much lower dimension. It is often useful to determine or estimate the dimension of this manifold, for instance before performing dimension reduction. Existing methods for dimension estimation are calibrated using a flat unit ball. In this paper, we develop CA-PCA, a version of local PCA based instead on a calibration of a quadratic embedding, acknowledging the curvature of the underlying manifold. Numerous careful experiments show that this adaptation improves the estimator in a wide range of settings.
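
To make the flat-ball calibration mentioned above concrete, the following is a minimal sketch of a local PCA dimension estimate. It is not the paper's CA-PCA and is not taken from the authors' code. It relies on the standard fact that the uniform distribution on a unit $d$-ball has covariance eigenvalues $1/(d+2)$ with multiplicity $d$. The function name `local_pca_dimension`, its parameters, and the least-squares matching rule are assumptions chosen for illustration.

```python
import numpy as np


def local_pca_dimension(points, center, radius):
    """Flat-ball-calibrated local PCA estimate of the dimension near `center`.

    Hypothetical helper for illustration; assumes `points` has shape (n, D).
    """
    # Rescale so that the neighbourhood of interest becomes the unit ball.
    offsets = (np.asarray(points, dtype=float) - center) / radius
    local = offsets[np.linalg.norm(offsets, axis=1) <= 1.0]
    if len(local) == 0:
        raise ValueError("no points fall inside the ball")

    # Second-moment matrix about the center, eigenvalues in descending order.
    second_moment = local.T @ local / len(local)
    eigvals = np.sort(np.linalg.eigvalsh(second_moment))[::-1]

    # Compare against the flat-ball reference spectrum for each candidate d:
    # d eigenvalues equal to 1/(d+2), the remaining D-d equal to 0.
    ambient_dim = local.shape[1]
    best_d, best_err = 1, np.inf
    for d in range(1, ambient_dim + 1):
        reference = np.zeros(ambient_dim)
        reference[:d] = 1.0 / (d + 2)
        err = np.linalg.norm(eigvals - reference)
        if err < best_err:
            best_d, best_err = d, err
    return best_d


def sample_flat_ball(n, d, ambient_dim, rng):
    """Sample n points uniformly from a unit d-ball embedded in R^ambient_dim."""
    directions = rng.standard_normal((n, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = rng.random(n) ** (1.0 / d)  # radial law of the uniform ball
    flat = directions * radii[:, None]
    return np.hstack([flat, np.zeros((n, ambient_dim - d))])
```

On flat data this reference spectrum is exact in the large-sample limit. On a curved manifold the observed eigenvalues deviate from it as the radius grows, which is the situation the curvature-adapted calibration is meant to address.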

Paper Structure (23 sections, 5 theorems, 88 equations, 9 figures, 1 algorithm)

This paper contains 23 sections, 5 theorems, 88 equations, 9 figures, 1 algorithm.

Key Result

Lemma 2.1

Let $W$ be a $d$-dimensional subspace of $\mathbb{R}^D$ ($D\ge d$) and let $\nu$ denote the $d$-dimensional Lebesgue measure on $W$ intersected with the unit ball of $\mathbb{R}^D$. Then

Figures (9)

  • Figure 1: Results for Synthetic Data
  • Figure 2: Synthetic Manifolds in Higher Dimensions
  • Figure 3: Standard Deviation of Estimates for Synthetic Manifolds in Higher Dimensions
  • Figure 4: Standard Deviation of Estimates for Synthetic Manifolds in Higher Dimensions
  • Figure 5: Results for Simulated Data
  • ...and 4 more figures

Theorems & Definitions (7)

  • Lemma 2.1: Lemma 6.1 in lim2021tangent, Lemma 13 in arias2017spectral
  • Proposition 3.1
  • Lemma 3.2
  • Proof 1
  • Theorem A.1
  • Lemma A.2
  • Proof 2