Persistence-based topological optimization: a survey

Mathieu Carriere, Yuichi Ike, Théo Lacombe, Naoki Nishikawa

Abstract

Computational topology provides a tool, persistent homology, to extract quantitative descriptors from structured objects (images, graphs, point clouds, etc.). These descriptors can then be used in optimization problems, typically as a way to incorporate topological priors or to regularize machine learning models. This is usually achieved by minimizing suitable, topologically informed losses based on these descriptors, which, in turn, naturally raises theoretical and practical questions about the possibility of optimizing such loss functions using gradient-based algorithms. This has been an active research field in the topological data analysis community over the last decade, and various techniques have been developed to enable optimization of persistence-based loss functions with gradient descent schemes. This survey presents the current state of this field, covering its theoretical foundations and algorithmic aspects, and showcasing practical uses in several applications. It includes a detailed introduction to persistence theory and, as such, aims to be accessible to mathematicians and data scientists who are newcomers to the field. It is accompanied by an open-source library which implements the different approaches covered in this survey, providing a convenient playground for researchers to get familiar with the field.

Paper Structure

This paper contains 55 sections, 18 theorems, 58 equations, 19 figures, 3 tables, 9 algorithms.

Key Result

Theorem 2.17

Let $\mathcal{K}=(K_t)_{t\in\mathbb{R}}$ be a filtration of a finite simplicial complex $K$. For any $t\in\mathbb{R}$, we let $\partial_{p,t} \coloneqq \partial_p|_{C_p(K_t)}$, i.e., the restriction of $\partial_p$ to the vector subspace $C_p(K_t)\subseteq C_p(K)$. Moreover, for each $z \in \dots$ […] Additionally, the multiset $\{(b_i, d_i)\}_{i=1}^m$ that satisfies the conditions above is unique (…
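
To make the multiset of pairs $(b_i, d_i)$ concrete, here is a minimal sketch that computes it in homological dimension 0 for a small point cloud. It is not taken from the survey's accompanying library: the function name `h0_persistence` is hypothetical, and the filtration convention is an assumption (all vertices enter at $t=0$, and an edge enters at $t$ equal to its Euclidean length, as in a Vietoris-Rips filtration). Under that convention, every connected component is born at 0 and dies when an edge first merges it into another component, which a union-find structure tracks directly.

```python
from itertools import combinations
from math import dist, inf


def h0_persistence(points):
    """Return the dimension-0 pairs (birth, death) of a Rips-type filtration.

    Assumed convention: vertices enter at t = 0, an edge enters at its length.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over the vertices

    def find(i):
        # Find the representative of i, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process edges in order of appearance in the filtration.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # This edge merges two components: one of them dies here.
            parent[ri] = rj
            pairs.append((0.0, length))

    if n > 0:
        # The last surviving component never dies.
        pairs.append((0.0, inf))
    return pairs


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
print(h0_persistence(points))
# -> [(0.0, 1.0), (0.0, 4.0), (0.0, inf)]
```

The three collinear points give two finite pairs, one for each merge (at edge lengths 1 and 4), and one pair with infinite death for the component that survives. The edge of length 5 closes a cycle rather than merging components, so it contributes nothing in dimension 0.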

Figures (19)

  • Figure 1.1: Standard topological optimization scheme typically used in deep learning pipelines, with examples on point cloud, graph and image data. Note that step $1$ need not be at the beginning of the pipeline, and can be incorporated at any stage (this happens when input data are computed as the outputs of some other models, such as, e.g., image filters automatically computed by CNNs). The blue square is the main theoretical question that we discuss in this survey.
  • Figure 2.1: Boundary operator
  • Figure 2.2: A simplicial complex (whose geometric realization is that of a torus embedded in $\mathbb{R}^3$).
  • Figure 2.3: Illustration of the Čech filtration and the resulting persistence diagram. From left to right: an input point cloud (a sample of $n=1000$ points on a torus), increasing sublevel sets of the distance function to that point cloud, i.e., $\bigcup_{x \in X} B(x,t)$, and, on the right, the corresponding persistence diagram (see \ref{def:PD}). Colors in the persistence diagram correspond to different homology dimensions; the prominent point in red (death = $+\infty$) accounts for the unique connected component of the (underlying) manifold. The two prominent points in blue correspond to the two generating loops in $H_1$ (recall \ref{example:torus_CS}); their death coordinates indicate a "small" radius $r_1 = 2$ and a large radius $r_2 = 5$. The green point accounts for the cavity ($H_2$, also filled in at $t=2$). Points closer to the diagonal $\{b=d\}$ (in particular, red and blue ones) correspond to less persistent features accounting for the sampling (they would vanish, i.e., collapse onto the diagonal, as $n \to +\infty$).
  • Figure 2.4: Partial matching distance between PDs.
  • ...and 14 more figures

Theorems & Definitions (78)

  • Definition 2.1
  • Example 2.2
  • Definition 2.3
  • Example 2.4
  • Example 2.5
  • Remark 2.6
  • Definition 2.7
  • Remark 2.8
  • Example 2.9: Čech filtration on point clouds
  • Remark 2.10
  • ...and 68 more