Persistence-based Modes Inference

Hugo Henneuse

TL;DR

This work develops a topological-data-analytic framework to infer multiple density modes from i.i.d. samples by coupling rough histogram-based level-set estimation with $H_{0}$-persistence diagrams. It introduces the piecewise-Hölder class $S_d(L,\alpha,\mu,R_{\mu},C,h_0)$ under a geometric $\mu$-reach condition and proves that the $H_{0}$-diagram can be estimated at minimax rates; above a critical mode-separation threshold, the method exactly recovers the number of modes and localizes them with near-optimal rates for the mode locations and their prominences. The approach yields consistent mode inference even in the presence of discontinuities near modes, and it provides a quantitative threshold $h_0$ that delineates when modes are detectable. A numerical illustration demonstrates robustness to irregularities and superiority over mean-shift in identifying the correct number of modes and their locations. Overall, the paper extends persistence-based modal analysis to broad, irregular densities and establishes minimax guarantees for both diagram estimation and multi-mode inference.

Abstract

We address the problem of estimating multiple modes of a multivariate density using persistent homology, a central tool in Topological Data Analysis. We introduce a method based on the preliminary estimation of the $H_0$-persistence diagram to infer the number of modes, their locations, and the corresponding local maxima. For broad classes of piecewise-continuous functions with geometric control on the discontinuity loci, we identify a critical separation threshold between modes, also interpretable in our framework in terms of mode prominence, below which mode inference is impossible and above which our procedure achieves minimax optimal rates.
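As a rough illustration of the idea of reading modes off an $H_0$-persistence diagram, the Python sketch below builds a histogram from a one-dimensional sample, computes the $H_0$ persistence of its superlevel sets with a union-find sweep, and keeps the classes whose prominence exceeds a cutoff. It is a simplified one-dimensional stand-in and not the paper's estimator: the helper `h0_superlevel_pairs`, the synthetic two-component Gaussian sample, the bin count, and the cutoff `threshold` are all assumptions made for illustration.

```python
import numpy as np

def h0_superlevel_pairs(values):
    """Return (birth, death, index) triples for the H0 persistence of the
    superlevel-set filtration of a function sampled on a 1D grid.
    `index` is the grid position of the local maximum that created the class."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    parent = np.full(n, -1)   # -1 marks cells not yet in the filtration
    peak = np.arange(n)       # for each root: index of its component's maximum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    pairs = []
    for i in np.argsort(-values, kind="stable"):   # sweep the level downward
        parent[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # Elder rule: the component with the lower maximum dies here.
                young, old = (a, b) if values[peak[a]] < values[peak[b]] else (b, a)
                if values[peak[young]] > values[i]:   # skip zero-persistence classes
                    pairs.append((float(values[peak[young]]), float(values[i]), int(peak[young])))
                parent[young] = old
    root = find(0)   # the grid is connected, so a single component remains
    pairs.append((float(values[peak[root]]), float("-inf"), int(peak[root])))
    return pairs

# Synthetic sample (assumption): two well-separated Gaussian bumps.
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(-2.0, 0.5, 2000), rng.normal(2.0, 0.5, 2000)])

heights, edges = np.histogram(samples, bins=30, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

threshold = 0.15   # hypothetical prominence cutoff, chosen by hand for this sample
modes = sorted(float(centers[k]) for b, d, k in h0_superlevel_pairs(heights) if b - d > threshold)
print("estimated number of modes:", len(modes))
print("estimated mode locations:", modes)
```

Low-prominence classes produced by histogram noise are discarded by the cutoff, which plays the role that the separation threshold plays in the text; here it is set by hand rather than derived from the sample size.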

Paper Structure

This paper contains 24 sections, 22 theorems, 161 equations, 10 figures, and 1 table.

Key Result

Theorem (Chazal et al., 2009)

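Let $\mathbb{V}$ and $\mathbb{W}$ be two $q$-tame persistence modules. If $\mathbb{V}$ and $\mathbb{W}$ are $\varepsilon$-interleaved, then $d_{b}\left(\operatorname{dgm}(\mathbb{V}),\operatorname{dgm}(\mathbb{W})\right)\leq\varepsilon$, where $d_{b}$ denotes the bottleneck distance.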

Figures (10)

  • Figure 1: 1D illustration of the link between local maxima and the $H_{0}$-persistence diagram. The birth times $b_{1}$, ..., $b_{6}$ correspond to the local maxima of the function.
  • Figure 2: 2D example with 2 closest points
  • Figure 3: (a) shows a partition $M_{1},\dots,M_{6}$ such that $\operatorname{reach}_{1}\left(\partial M_{1}\cup\dots\cup \partial M_{6}\right)>0$. (b) shows a partition $M_{1},\dots,M_{6}$ such that $\operatorname{reach}_{1}\left(\partial M_{1}\cup\dots\cup \partial M_{6}\right)=0$ (problematic points are highlighted in red), although $\operatorname{reach}_{\mu}\left(\partial M_{1}\cup\dots\cup \partial M_{6}\right)>0$ for sufficiently small $\mu>0$.
  • Figure 4: Illustration of the $\mu$-medial axis of a set $K$. The $1$-medial axis is shown in black, the $0$-medial axis in red, and the $\mu$-medial axis, for a small $0<\mu<1/2$, in blue.
  • Figure 5: Superlevel-set filtration of $f(x)=x\cos(4\pi x)$ over $[0,1]$ and the associated $H_{0}$-persistence diagram.
  • ...and 5 more figures
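
The correspondence between local maxima and birth times described in the captions of Figures 1 and 5 can be checked numerically. The sketch below evaluates $f(x)=x\cos(4\pi x)$ on a uniform grid over $[0,1]$ and computes the $H_{0}$ persistence pairs of its superlevel sets. The function name `superlevel_h0`, the grid size, and the convention of assigning death $-\infty$ to the class that never dies are assumptions of this sketch, not taken from the paper.

```python
import numpy as np

def superlevel_h0(values):
    """(birth, death) pairs of H0 for the superlevel sets of a 1D grid function."""
    values = np.asarray(values, dtype=float)
    parent = {}   # grid index -> parent index (union-find forest)
    top = {}      # root index -> highest value inside its component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    diagram = []
    for i in map(int, np.argsort(-values, kind="stable")):   # from high to low
        parent[i], top[i] = i, values[i]
        for j in (i - 1, i + 1):
            if j in parent:                  # neighbor already in the superlevel set
                a, b = find(i), find(j)
                if a != b:
                    # Elder rule: the component with the lower maximum dies.
                    young, old = (a, b) if top[a] < top[b] else (b, a)
                    if top[young] > values[i]:
                        diagram.append((float(top[young]), float(values[i])))
                    parent[young] = old
    # The component of the global maximum never dies.
    diagram.append((float(values.max()), float("-inf")))
    return diagram

x = np.linspace(0.0, 1.0, 1001)
diagram = superlevel_h0(x * np.cos(4 * np.pi * x))
for birth, death in sorted(diagram, reverse=True):
    print(f"birth = {birth:.3f}, death = {death:.3f}")
```

Each birth value is the height of a local maximum of $f$ on the grid (including the boundary maximum at $x=1$), and each finite death value is the height of the local minimum at which that peak's component merges into a higher one.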

Theorems & Definitions (42)

  • Theorem (Chazal et al., 2009)
  • Theorem (sup-norm stability)
  • Proposition 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Proposition 4.4
  • Lemma 5.1
  • Lemma 5.2
  • Lemma 5.3
  • Lemma 5.4
  • ...and 32 more