Steinhaus Filtration and Stable Paths in the Mapper

Dustin L. Arendt, Matthew Broussard, Bala Krishnamoorthy, Nathaniel Saul, Amber Thrall

TL;DR

This work introduces the Steinhaus filtration, a topological filtration built from a single cover using a generalized Steinhaus distance, which enables persistent homology analysis without a predefined metric. It proves stability for finite covers, shows that the Steinhaus and Čech filtrations are isomorphic for a finite point set in dimension one but not in general, and shows that the Vietoris-Rips (VR) filtration determines the 1-skeleton of the Steinhaus filtration in arbitrary dimension. The authors develop a theory of stable paths within the Steinhaus filtration and provide algorithms, with applications to recommendation systems and to explainable machine learning via a Steinhaus filtration of a Mapper complex. The results offer interpretable, metric-free insights into data structure and transitions between subpopulations, supported by code and by experiments on a movies dataset and on FashionMNIST.

Abstract

We define a new filtration, called the Steinhaus filtration, built from a single cover using a generalized Steinhaus distance, which is a generalization of the Jaccard distance. The homology persistence module of a Steinhaus filtration with infinitely many cover elements may not be $q$-tame, even when the covers are in a totally bounded space. While this may pose a challenge for deriving stability results, we show that the Steinhaus filtration is stable when the cover is finite. We show that while the Čech and Steinhaus filtrations are not isomorphic in general, they are isomorphic for a finite point set in dimension one. Furthermore, the Vietoris-Rips (VR) filtration completely determines the $1$-skeleton of the Steinhaus filtration in arbitrary dimension. We then develop a language and theory for stable paths within the Steinhaus filtration. We demonstrate how the framework can be applied in several settings where a standard metric may not be defined but a cover is readily available. We introduce a new perspective for modeling recommendation system datasets. As an example, we examine a movies dataset and find that the stable paths identified in our framework represent sequences of movies constituting a gentle transition and ordering from one genre to another. For explainable machine learning, we apply the Mapper algorithm for model induction by building a filtration from a single Mapper complex, and provide explanations in the form of stable paths between subpopulations. For illustration, we build a Mapper complex from a supervised machine learning model trained on the FashionMNIST dataset. Stable paths in the Steinhaus filtration provide improved explanations of relationships between subpopulations of images.

Paper Structure

This paper contains 12 sections, 10 theorems, 31 equations, 13 figures, 1 table, 1 algorithm.

Key Result

Theorem 2.5

The Steinhaus nerve of a cover $\mathcal{U}$ is a filtered simplicial complex.
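
To make the key result concrete, here is a minimal sketch of how a Steinhaus nerve could be computed for a small finite cover. It rests on assumptions that are not stated on this page: the generalized Steinhaus distance of a collection of cover elements is taken to be $1 - \mu(\bigcap_j U_j)/\mu(\bigcup_j U_j)$, with $\mu$ the counting measure, and a simplex is assigned that value as its filtration value. The function names `steinhaus_distance`, `steinhaus_nerve`, and `is_filtered_complex` are hypothetical and are not taken from the authors' code.

```python
from itertools import combinations


def steinhaus_distance(sets):
    """Generalized Steinhaus distance of a non-empty collection of finite sets.

    Assumption: counting measure, i.e. 1 - |intersection| / |union|.
    For two sets this reduces to the Jaccard distance.
    """
    sets = [frozenset(s) for s in sets]
    union = frozenset().union(*sets)
    if not union:
        return 0.0  # convention chosen here for all-empty input
    intersection = frozenset.intersection(*sets)
    return 1.0 - len(intersection) / len(union)


def steinhaus_nerve(cover, max_dim=2):
    """Map each simplex (a tuple of cover indices) to its filtration value.

    Simplices whose cover elements have empty intersection (value 1)
    are left out, as in an ordinary nerve.
    """
    filtration = {}
    for size in range(1, max_dim + 2):
        for simplex in combinations(range(len(cover)), size):
            value = steinhaus_distance([cover[i] for i in simplex])
            if value < 1.0:
                filtration[simplex] = value
    return filtration


def is_filtered_complex(filtration):
    """Check that every face is present and appears no later than its cofaces."""
    for simplex, value in filtration.items():
        for face in combinations(simplex, len(simplex) - 1):
            if face and filtration.get(face, float("inf")) > value:
                return False
    return True


if __name__ == "__main__":
    # Toy cover of the point set {1, ..., 5} by three overlapping elements.
    cover = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
    nerve = steinhaus_nerve(cover)
    for simplex, value in sorted(nerve.items(), key=lambda kv: (kv[1], kv[0])):
        print(simplex, round(value, 3))
    print("filtered complex:", is_filtered_complex(nerve))
```

In this toy cover the vertices enter at value 0, the edges (0, 1) and (1, 2) at 0.5, and the edge (0, 2) together with the triangle (0, 1, 2) at 0.8. The monotonicity check passes because adding a cover element to a collection can only shrink the intersection and grow the union, which is the intuition behind the theorem under the assumed definition.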

Figures (13)

  • Figure 1: Mapper (Right) constructed on a noisy point set sampled from a circle (Left). The cover consists of three overlapping intervals (in blue, green, red) covering the range of $y$-coordinate values of the points.
  • Figure 2: A cover with $7$ elements, and its nerve (left column). The cyan and green vertices are connected by a single edge generated by a single point in intersection of cyan and green cover elements. Removing this point gives the cover and nerve shown in the right column. The path from cyan to green node now has six edges.
  • Figure 3: A non $q$-tame Steinhaus filtered Mapper complex. The center boxes follow the pattern of blue, blue, red, red, blue, blue, etc. For each diamond in the complex there is a cycle that traverses around the diamond.
  • Figure 4: Nonisomorphic simplicial complexes for covers $\mathcal{U}$ (top row) and $\mathcal{V}$ (bottom row) with $d(\mathcal{U},\mathcal{V})=0$.
  • Figure 5: Persistence diagrams for the Vietoris-Rips filtration and the approximate Steinhaus filtration of a set of uniformly sampled points in the plane.
  • ...and 8 more figures

Theorems & Definitions (28)

  • Definition 2.1: Steinhaus Distance [Marczewski1958]
  • Definition 2.2: Generalized Steinhaus distance
  • Definition 2.3: Nerve
  • Definition 2.4: Steinhaus Nerve
  • Theorem 2.5
  • proof
  • Lemma 3.1
  • proof
  • Definition 3.2
  • Definition 3.3: Bottleneck distance
  • ...and 18 more