
Robustness of Persistent Topological Features and Minimum Homological Cuts

Pepijn Roos Hoefgeest, Lucas Slot

Abstract

Persistent homology is a popular method for computing topological features of (metric) data. Standard approaches based on the Čech or Rips filtration are stable under small perturbations of the data, but highly sensitive to outliers. This lack of robustness has been frequently addressed in the literature. In this paper, we take a novel perspective by asking the following question: When can we guarantee that an observed persistent feature (a bar) is inherent to the underlying data in the presence of a limited number of unknown, arbitrary outliers? We formalize this question by introducing the notion of \emph{adversarial robustness}, and study the problem of deciding whether a given bar in the barcode of a filtered simplicial complex is adversarially robust. We show that this problem is essentially equivalent to a homological variant of the minimum cut problem in simplicial complexes, which we believe to be of independent interest. As our main technical contribution, we provide the first computational complexity results for this problem, consisting of an efficient algorithm in $0$-dimensional homology, NP-hardness for the general problem, and an efficient algorithm for codimension-$1$ in $n$-dimensional complexes embedded in $\mathbb{R}^n$. We also analyze its natural linear programming relaxation, whose dual defines a homological analog of the max-flow problem in graphs. We show that a max-flow/min-cut theorem does not hold in our setting, implying that the LP relaxation is not tight in general. Finally, in the special case of the Rips filtration, we provide a global heuristic based on the Hausdorff distance that guarantees adversarial robustness of sufficiently long bars. This connects adversarial robustness to standard stability theorems in persistent homology.

Paper Structure (22 sections, 15 theorems, 17 equations, 8 figures)


Key Result

Proposition 1.4

Let $p \geq 0$, and let $B \in \mathcal{B}(\mathrm{PH}_p(\mathcal{K}))$ be a bar in the barcode of a simplex-wise filtration $\mathcal{K}$ of a simplicial complex $K$. Let $K_{B}$, $\tau_{B}$ be its pre-death complex and death simplex. Then, $B$ is $k$-adversarially robust (in degree $s$) if, and on Thus, $B$ is $k$-adversarially robust iff all homological $s$-cuts of $[\tau_{B}]$ in $K_{B}$ have

Figures (8)

  • Figure 1: A simplicial complex $K$ with cycles $c_{\rm left}, c_{\rm right} \in C_1(K; \mathbb{R})$ drawn in blue single arrows and red double arrows, respectively (all coefficients equal to $1$). The classes $[c_{\rm left}]$ and $[c_{\rm right}]$ generate $H_1(K; \mathbb{R}) \cong \mathbb{R}^2$. On the right: two subcomplexes obtained by removing subsets $C_1, C_2 \subseteq K^{(1)}$ (dashed) from $K$, respectively. Note that $C_1$ is an edge cut for $[c_{\rm left}]$ and $[c_{\rm left} + c_{\rm right}]$, but not for $[c_{\rm right}]$. On the other hand, $C_2$ is a $1$-cut for both $[c_{\rm left}]$ and $[c_{\rm right}]$, but not for $[c_{\rm left} + c_{\rm right}]$.
  • Figure 2: Four data sets in $\mathbb{R}^2$, each of size $100$, whose Rips filtrations each induce a $1$-dimensional persistent feature. For $X_1$ and $X_2$, these features are not $k$-adversarially robust for $k=10$, evidenced by the subsets $A_1 \subseteq X_1$ and $A_2 \subseteq X_2$ marked in red. The points in $A_2$ are quite dense, and so standard subsampling techniques will likely not remove them. On the other hand, the feature is $10$-robust in $X_3$, as its length exceeds $\mathcal{H}_{X_3, 10}$, which equals $d_H(X_3 \setminus A_3, X_3)$ for the subset $A_3 \subseteq X_3$ marked in red (cf. \ref{['THM:main:HH']}). In $X_4$, the feature induced by the densely sampled circle is $10$-robust, even though its bar has length strictly less than $\mathcal{H}_{X_4, 10}$ (evidenced by the set $A_4$ marked in red).
  • Figure 3: Surfaces $F_{u_1}$ and $F_{u_2}$ involved in the construction of $X(U,\mathcal{S})$ for $(U,\mathcal{S}) = (\{u_1,u_2,u_3,u_4\}, \{S_1,S_2,S_3\})$ with $S_1 = \{u_1,u_2,u_3\}, S_2= \{u_1,u_2,u_4\}, S_3 = \{u_1,u_3,u_4\}$. The surfaces are glued along boundary components with the same label, in an orientation preserving way.
  • Figure 4: Left: The embedded simplicial complex $K$ of \ref{['FIG:CUTEXAMPLE']} with a class $\gamma \in H_1(K)$ (blue, single arrows). Center: The (extended) dual graph of $K$. The vertices $V_{\mathbb{R}^n \setminus K} = \{ v_1, v_2, v_\infty\}$ and edges added to obtain $\mathcal{G}_{K}^*$ from $\mathcal{G}_{K}$ are in red. Only some of the edges incident to $v_\infty$ are drawn. Right: A shortest path $v_1 \rightarrow v_\infty$ (green, single arrows) and a shortest path $v_1 \rightarrow v_2$ (purple, double arrows) in $\mathcal{G}_{K}^*$. These correspond to the (minimum) $1$-cuts for $\gamma$ depicted in \ref{['FIG:CUTEXAMPLE']}, see \ref{['PROP:cut_alex_dual_complete']}. The paths are directed for visual clarity only.
  • Figure 5: The simplicial complex and cycles $c$ (red, double arrows) and $c_1, c_2$ (green, single arrows, on the left and right, respectively) of \ref{['EXMP:mincutmaxflow']}.
  • ...and 3 more figures
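
The caption of Figure 2 refers to the Hausdorff distance $d_H(X \setminus A, X)$ between a point cloud $X$ and what remains after removing a subset $A$. The following is a minimal Python sketch of that quantity only, not of the paper's heuristic or of $\mathcal{H}_{X,k}$ itself. The function name `hausdorff_after_removal` and the toy data are assumptions made for illustration. Because $X \setminus A \subseteq X$, every remaining point is at distance $0$ from $X$, so the Hausdorff distance reduces to the largest distance from a removed point to its nearest remaining point.

```python
import math


def hausdorff_after_removal(points, removed):
    """Hausdorff distance between X and X minus A, for finite X in R^n.

    `points` is a list of coordinate tuples (the set X); `removed` is a
    collection of indices into `points` (the subset A). Hypothetical helper
    written for illustration only.
    """
    removed = set(removed)
    kept = [p for i, p in enumerate(points) if i not in removed]
    if not kept:
        raise ValueError("at least one point must remain after removal")
    # Points that remain are at distance 0 from themselves, so only the
    # removed points can contribute to the maximum.
    return max(
        (min(math.dist(points[i], q) for q in kept) for i in removed),
        default=0.0,
    )


# Toy data (assumed): corners of a unit square plus one distant outlier.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 1.0)]
print(hausdorff_after_removal(X, {4}))  # remove the outlier
print(hausdorff_after_removal(X, {0}))  # remove one corner
```

In this toy example, removing the isolated outlier changes the point cloud far more, in the Hausdorff sense, than removing a corner that has close neighbours.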

Theorems & Definitions (43)

  • Definition 1.2: Adversarial robustness
  • Definition 1.3: homological cuts
  • Proposition 1.4
  • Theorem 1.6
  • Theorem 1.7
  • Theorem 1.8
  • Theorem 3.1: Induced matching theorem [Bauer2013InducedMA]
  • proof : Proof of \ref{['THM:main:LOC']}
  • Proposition 4.1
  • proof
  • ...and 33 more