Persistence-based Modes Inference
Hugo Henneuse
TL;DR
This work develops a Topological Data Analysis framework for inferring multiple modes of a density from i.i.d. samples, coupling a rough histogram-based estimation of level sets with $H_{0}$-persistence diagrams. It introduces the piecewise-Hölder class $S_d(L,\alpha,\mu,R_{\mu},C,h_0)$, defined under a geometric $\mu$-reach condition, and proves that the $H_{0}$-diagram can be estimated at minimax rates. Above a critical mode-separation threshold, the method exactly recovers the number of modes and estimates their locations and prominences at near-optimal rates. Mode inference remains consistent even when discontinuities occur near modes, and the quantitative threshold $h_0$ delineates when modes are detectable. A numerical illustration shows robustness to irregularities and better recovery of the number of modes and their locations than mean-shift. Overall, the paper extends persistence-based modal analysis to broad classes of irregular densities and establishes minimax guarantees for both diagram estimation and multi-mode inference.
Abstract
We address the problem of estimating multiple modes of a multivariate density using persistent homology, a central tool in Topological Data Analysis. We introduce a method based on the preliminary estimation of the $H_0$-persistence diagram to infer the number of modes, their locations, and the corresponding local maxima. For broad classes of piecewise-continuous functions with geometric control on the discontinuity loci, we identify a critical separation threshold between modes, which in our framework can also be interpreted in terms of mode prominence. Below this threshold mode inference is impossible; above it, our procedure achieves minimax-optimal rates.
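To make the idea of reading modes off an $H_0$-persistence diagram concrete, the following is a minimal one-dimensional sketch, not the procedure analyzed in the paper. It assumes the density has already been summarized by histogram bin heights (for instance from `numpy.histogram` with `density=True`), computes the $H_0$ persistence pairs of the superlevel-set filtration with a union-find structure, and keeps the peaks whose prominence (birth minus death) exceeds a user-chosen threshold. The function names `h0_superlevel_persistence` and `select_modes`, the threshold `tau`, and the convention that the global maximum dies at level 0 are all illustrative assumptions.

```python
def h0_superlevel_persistence(heights):
    """H0 persistence pairs (birth, death, peak_index) of the superlevel-set
    filtration of a 1D sequence of bin heights (neighbors are adjacent bins)."""
    n = len(heights)
    # Bins enter the filtration from the highest to the lowest.
    order = sorted(range(n), key=lambda i: -heights[i])
    parent = [-1] * n   # -1 means "not yet in the filtration"
    peak = {}           # root of a component -> index of its highest bin
    pairs = []

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:
        parent[i] = i
        peak[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the lower peak dies here.
                if heights[peak[ri]] < heights[peak[rj]]:
                    ri, rj = rj, ri
                dying = peak[rj]
                if heights[dying] > heights[i]:
                    # Record only pairs with positive prominence.
                    pairs.append((heights[dying], heights[i], dying))
                parent[rj] = ri
    if n:
        # Assumed convention: the global maximum dies at level 0.
        pairs.append((heights[order[0]], 0.0, order[0]))
    return pairs


def select_modes(heights, tau):
    """Peaks with prominence above tau, as (index, height, prominence)."""
    modes = [(idx, birth, birth - death)
             for birth, death, idx in h0_superlevel_persistence(heights)
             if birth - death > tau]
    return sorted(modes)


# Toy histogram with three local maxima, at bins 1, 3 and 5.
heights = [0, 3, 1, 5, 2, 4, 0]
print(select_modes(heights, tau=1.5))
print(select_modes(heights, tau=3))
```

In this toy example the prominence threshold plays the role of a separation level: a small `tau` retains all three peaks, while a larger one keeps only the dominant peak. In the paper's setting the threshold is tied to the class parameters and the sample size rather than chosen by hand.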
