
A Wiener Process Perspective on Local Intrinsic Dimension Estimation Methods

Piotr Tempczyk, Łukasz Garncarek, Dominik Filipiak, Adam Kurpisz

TL;DR

This work introduces a Wiener-process perspective on Local Intrinsic Dimension (LID) estimation, reframing the initial perturbation of data as diffusion with density $\rho_t = p_S * \phi_t^D$ and linking the diffusion dynamics to intrinsic dimension via the asymptotic slope $\beta(x) = \lim_{t\to0^+} \beta_t(x)$. It differentiates isolated and holistic Wiener-process-based LID algorithms, derives closed-form expressions for key LID-related quantities in several density settings, and proves the equivalence of reformulations of LIDL in terms of log-density slopes and diffusion derivatives. The paper analyzes biases and limits across diverse on-manifold densities (uniform, Gaussian, and piecewise/non-differentiable densities) and extends results to unions and mixtures of manifolds, highlighting how density geometry and diffusion time $t$ affect LID estimates. Overall, it provides a theoretical foundation to improve diffusion-based LID methods (e.g., LIDL, FLIPD) and suggests future work on curvature, multi-manifold interactions, and density-based Laplacian estimates to enhance scalability and accuracy in high-dimensional data.

Abstract

Local intrinsic dimension (LID) estimation methods have received a lot of attention in recent years thanks to progress in deep neural networks and generative modeling. In contrast to older non-parametric methods, new methods use generative models to approximate the diffused dataset density, which scales the approach to high-dimensional datasets (e.g., images). In this paper, we investigate recent state-of-the-art parametric LID estimation methods from the perspective of the Wiener process. We explore how these methods behave when their assumptions are not met, and we give an extended mathematical description of the methods and of their error as a function of the probability density of the data.
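The core idea behind the diffusion-based estimators discussed here (e.g., LIDL) is that for a density supported on a $d$-dimensional manifold in $\mathbb{R}^D$, the log of the diffused density at an on-manifold point grows like $(d - D)\log\sigma$ as the noise scale $\sigma \to 0$, so the LID can be read off as $D$ plus the slope of $\log\rho_\sigma$ versus $\log\sigma$. The following is a minimal illustrative sketch, not code from the paper: it uses the closed-form diffused density for a standard Gaussian supported on a $d$-dimensional linear subspace of $\mathbb{R}^D$ (one of the analytically tractable settings) and recovers $d$ from a finite-difference slope; the function name and the specific values of `d`, `D`, and the sigmas are my own choices.

```python
import numpy as np

def log_rho(sigma, d, D):
    # Closed-form log diffused density at the origin for a standard Gaussian
    # supported on a d-dimensional linear subspace of R^D, after adding
    # isotropic Gaussian noise with standard deviation sigma:
    # on-manifold coordinates have variance 1 + sigma^2, the remaining
    # D - d coordinates have variance sigma^2.
    on_manifold = -0.5 * d * np.log(2 * np.pi * (1 + sigma**2))
    off_manifold = -0.5 * (D - d) * np.log(2 * np.pi * sigma**2)
    return on_manifold + off_manifold

D, d = 10, 3
sigmas = np.array([1e-3, 2e-3])  # small sigmas: the asymptotic regime

# Finite-difference slope of log rho_sigma with respect to log sigma;
# in the limit sigma -> 0 this slope tends to d - D.
slope = (log_rho(sigmas[1], d, D) - log_rho(sigmas[0], d, D)) / np.log(sigmas[1] / sigmas[0])
lid = D + slope
print(round(lid, 3))  # close to the true intrinsic dimension d = 3
```

At larger $\sigma$ the on-manifold term contributes an extra $-d\,\sigma^2/(1+\sigma^2)$ to the slope, which is one concrete instance of the $t$-dependent bias the paper analyzes.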

Paper Structure

This paper contains 31 sections, 12 theorems, 65 equations, 2 figures.

Key Result

Lemma 4.1

For $t>0$ and $x, y \in \mathbb{R}^D$ we have

Figures (2)

  • Figure 1: LIDL estimates for Gaussian distributions.
  • Figure 2: LIDL estimates.

Theorems & Definitions (18)

  • Lemma 4.1
  • Corollary 4.2
  • Proposition 5.1
  • Proposition 5.2
  • Proposition 5.3
  • Lemma B.1
  • Corollary B.2
  • Proposition C.1
  • ...and 8 more