Geometric embeddings of spaces of persistence diagrams with explicit distortions
Atish Mitra, Ziga Virk
TL;DR
The paper addresses the problem of mapping persistence diagrams into a Hilbert space, so that standard statistics and data analysis apply, by constructing explicit $1$-Lipschitz embeddings of the space $\mathcal{D}_n$ of persistence diagrams on $n$ points (with the bottleneck distance) into $\ell^2$ and, on bounded domains, into finite-dimensional Euclidean spaces, in both cases with explicit distortion functions. The approach centers on multi-scale, landmark-based embeddings built from covers of $\mathcal{D}_n$: each component depends only on the bottleneck distance to a landmark diagram, and the components are assembled coherently across scales to yield coarse and uniform embeddings. Key contributions include a general framework for multi-scale embeddings (Theorem ThmGluingScales) with explicit distortion control, a bounded-domain embedding with an explicit distortion bound, and a concrete injective finite-dimensional map on bounded domains. The work situates these constructions within quantitative dimension theory and discusses practical considerations, limitations (e.g., embedding multiplicity and non-injectivity at fixed scales), and directions for future implementation and optimization, including comparison with existing persistence diagram vectorizations. Overall, the paper provides implementable, distortion-controlled vectorizations that connect topological summaries with standard statistical tools in TDA.
Abstract
Let $n$ be a positive integer. We provide an explicit geometrically motivated $1$-Lipschitz map from the space of persistence diagrams on $n$ points (equipped with the Bottleneck distance) into the Hilbert space $\ell^2$. Such maps are a crucial step in topological data analysis, allowing the use of statistical methods (and thus data analysis) on collections of persistence diagrams. The main advantage of our maps as compared to most of the other such vectorizations is that they are coarse and uniform embeddings with explicit distortion functions. This allows us to control the amount of geometric information lost through their application. Furthermore, we also provide an explicit $1$-Lipschitz map from the space of persistence diagrams on $n$ points on a bounded domain into a Euclidean space with an explicit distortion function. We conclude with a differently flavored embedding of the space of persistence diagrams on $n$ points on a bounded domain into $\mathbb{R}^{n(n+1)}$. The maps we construct are fairly simple, with each component depending only on the bottleneck distance to the corresponding ``landmark'' persistence diagram. Due to geometric motivation from classical dimension theory, our methods are best described as quantitative dimension theory.
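The idea of a component that depends only on the bottleneck distance to a landmark diagram can be illustrated with a minimal Python sketch. This is not the paper's construction, which chooses landmarks from covers at several scales and combines the components so that the map into $\ell^2$ is $1$-Lipschitz. The sketch only shows why a single coordinate $D \mapsto d_B(D, L)$ is $1$-Lipschitz: by the triangle inequality, $|d_B(A, L) - d_B(B, L)| \le d_B(A, B)$. The function names `bottleneck` and `landmark_vector`, the sample landmarks, and the convention that points may be matched to the diagonal are assumptions made for illustration. The brute-force search over matchings is practical only for very small diagrams.

```python
from itertools import permutations


def bottleneck(d1, d2):
    """Brute-force bottleneck distance between two small diagrams.

    Each diagram is a list of (birth, death) pairs. Points may be matched
    to each other (L-infinity cost) or to the diagonal (half-persistence
    cost). This convention is an assumption of the sketch.
    """
    m, k = len(d1), len(d2)
    size = m + k
    if size == 0:
        return 0.0

    def cost(i, j):
        # Rows: m points of d1, then k diagonal slots.
        # Columns: k points of d2, then m diagonal slots.
        if i < m and j < k:
            (b1, e1), (b2, e2) = d1[i], d2[j]
            return max(abs(b1 - b2), abs(e1 - e2))
        if i < m:  # point of d1 sent to the diagonal
            b, e = d1[i]
            return (e - b) / 2
        if j < k:  # point of d2 sent to the diagonal
            b, e = d2[j]
            return (e - b) / 2
        return 0.0  # diagonal matched to diagonal

    # Minimize the largest matching cost over all perfect matchings.
    return float(min(
        max(cost(i, p[i]) for i in range(size))
        for p in permutations(range(size))
    ))


def landmark_vector(diagram, landmarks):
    """One coordinate per landmark: the bottleneck distance to it."""
    return [bottleneck(diagram, lm) for lm in landmarks]


# Hypothetical landmarks and diagrams, chosen only for illustration.
landmarks = [[], [(0.0, 1.0)], [(1.0, 4.0)]]
A = [(0.0, 2.0)]
B = [(0.0, 3.0)]

vec_a = landmark_vector(A, landmarks)
vec_b = landmark_vector(B, landmarks)
d_ab = bottleneck(A, B)

# Each coordinate changes by at most d_B(A, B).
assert all(abs(x - y) <= d_ab + 1e-12 for x, y in zip(vec_a, vec_b))
```

The coordinate-wise bound gives a $1$-Lipschitz map only for the sup norm on the target. Obtaining a $1$-Lipschitz map into $\ell^2$ or a Euclidean space with controlled distortion requires the scaling and gluing of components described in the paper.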
