Diffeomorphic interpolation for efficient persistence-based topological optimization

Mathieu Carriere, Marc Theveneau, Théo Lacombe

TL;DR

This work tackles the extreme sparsity of gradients in persistence-based topological optimization by introducing a diffeomorphic interpolation that extends the sparse gradient $\nabla L(X)$ to a smooth vector field $\tilde{v}$ on $\mathbb{R}^d$, with descent guaranteed along the flow $\dot X=-\tilde{v}_t(X)$. Constructed in a Gaussian RKHS, $\tilde{v}$ interpolates the nonzero gradient entries on the index set $I$, so it agrees with the vanilla gradient on the points that carry it, while being defined on the whole space with a quantifiable Lipschitz constant. The approach combines with subsampling: the field computed on a subsample updates the full input in linear time, and a learned flow can be re-used on new data, including latent spaces of pre-trained black-box autoencoders, where reversing the flow allows sampling topologically regular representations. Empirical results show faster convergence than vanilla gradients, scalability to large point clouds (e.g., the Stanford Bunny) when combined with subsampling, and successful regularization of latent spaces in black-box AE models, yielding improved interpretability. The method offers a practical, theoretically grounded route to integrate topology-aware priors into large-scale data analysis and model regularization tasks.
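
The interpolation step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names, the kernel convention $\exp(-\|x-y\|^2/(2\sigma^2))$, and the made-up sparse gradient are all assumptions introduced here. The sketch solves a small linear system on the index set $I$ and then evaluates the resulting field at arbitrary points.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix with entries exp(-||a - b||^2 / (2 sigma^2)).

    The bandwidth convention is an assumption of this sketch.
    """
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def diffeomorphic_interpolation(X, grad, sigma):
    """Extend a sparse gradient on X to a smooth vector field on R^d.

    Returns a callable v_tilde and the index set I of nonzero gradient rows.
    (Hypothetical helper, named here for illustration only.)
    """
    I = np.flatnonzero(np.linalg.norm(grad, axis=1) > 0)
    K = gaussian_kernel(X[I], X[I], sigma)   # |I| x |I| Gram matrix
    alpha = np.linalg.solve(K, grad[I])      # coefficients so that v_tilde(x_i) = grad_i on I

    def v_tilde(Y):
        # Kernel expansion centered at the points that carry gradient.
        return gaussian_kernel(Y, X[I], sigma) @ alpha

    return v_tilde, I

# Toy data: 200 points in the unit square, with a made-up gradient on 4 of them.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
grad = np.zeros_like(X)
grad[[3, 50, 120, 199]] = rng.normal(size=(4, 2))

v_tilde, I = diffeomorphic_interpolation(X, grad, sigma=0.1)
field_on_X = v_tilde(X)

# Number of points that receive a non-negligible update from the smooth field.
moved = int((np.linalg.norm(field_on_X, axis=1) > 1e-3).sum())
```

On the four points of $I$ the field reproduces the gradient, while neighbouring points within a few $\sigma$ also move, which is the behaviour the summary attributes to the method. A direct solve is used because $|I|$ is small; for a large index set a regularized or iterative solver may be preferable.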

Abstract

Topological Data Analysis (TDA) provides a pipeline to extract quantitative topological descriptors from structured objects. This enables the definition of topological loss functions, which assess to what extent a given object exhibits some topological properties. These losses can then be used to perform topological optimization via gradient descent routines. While theoretically sound, topological optimization faces an important challenge: gradients tend to be extremely sparse, in the sense that the loss function typically depends on only very few coordinates of the input object, yielding dramatically slow optimization schemes in practice. Focusing on the central case of topological optimization for point clouds, we propose in this work to overcome this limitation using diffeomorphic interpolation, turning sparse gradients into smooth vector fields defined on the whole space, with quantifiable Lipschitz constants. In particular, we show that our approach combines efficiently with subsampling techniques routinely used in TDA, as the diffeomorphism derived from the gradient computed on a subsample can be used to update the coordinates of the full input object, allowing us to perform topological optimization on point clouds at an unprecedented scale. Finally, we also showcase the relevance of our approach for black-box autoencoder (AE) regularization, where we aim at enforcing topological priors on the latent spaces associated with fixed, pre-trained, black-box AE models, and where we show that learning a diffeomorphic flow can be done once and then re-applied to new data in linear time (while vanilla topological optimization has to be re-run from scratch). Moreover, reversing the flow allows us to generate data by sampling the topologically-optimized latent space directly, yielding better interpretability of the model.
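
The subsampling and re-use ideas in the abstract can be illustrated with a toy flow. The sketch below is written under loud assumptions: the gradient of the point cloud's diameter stands in for a persistence-based gradient (it is also sparse, touching only two points), all function names are hypothetical, and the reversal is a simple first-order approximation (steps replayed in reverse order with the opposite sign), not necessarily the procedure used in the paper.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix, exp(-||a - b||^2 / (2 sigma^2)) (assumed convention)."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def diameter_grad(S):
    """Stand-in sparse gradient: gradient of max_{i,j} ||s_i - s_j||.

    Only the two points realizing the diameter get a nonzero row.
    This is NOT a persistence loss; it only mimics the sparsity pattern.
    """
    D = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
    i, j = np.unravel_index(np.argmax(D), D.shape)
    g = np.zeros_like(S)
    u = (S[i] - S[j]) / D[i, j]
    g[i], g[j] = u, -u
    return g

def learn_flow(X, n_steps, s, lam, sigma, rng):
    """Run the descent on X, computing each gradient on a subsample of size s.

    Every step stores (centers, alpha), which is enough to replay the flow later.
    """
    X = X.copy()
    steps = []
    for _ in range(n_steps):
        S = X[rng.choice(len(X), size=s, replace=False)]
        g = diameter_grad(S)
        I = np.flatnonzero(np.abs(g).sum(axis=1) > 0)
        centers = S[I].copy()
        alpha = np.linalg.solve(gaussian_kernel(centers, centers, sigma), g[I])
        steps.append((centers, alpha))
        # The field fitted on the subsample updates the FULL point cloud.
        X = X - lam * gaussian_kernel(X, centers, sigma) @ alpha
    return X, steps

def apply_flow(Y, steps, lam, sigma, reverse=False):
    """Replay the stored flow on arbitrary points; cost is linear in len(Y).

    reverse=True applies the steps backwards with the opposite sign, which
    inverts each step only up to an O(lam^2) error.
    """
    Y = Y.copy()
    sign = 1.0 if reverse else -1.0
    for centers, alpha in (reversed(steps) if reverse else steps):
        Y = Y + sign * lam * gaussian_kernel(Y, centers, sigma) @ alpha
    return Y

rng = np.random.default_rng(0)
lam, sigma = 0.02, 0.5
X0 = rng.uniform(size=(300, 2))
X_final, steps = learn_flow(X0, n_steps=20, s=50, lam=lam, sigma=sigma, rng=rng)

Y_new = rng.uniform(size=(1000, 2))                           # unseen data
Y_moved = apply_flow(Y_new, steps, lam, sigma)                # no gradient recomputed
X_back = apply_flow(X_final, steps, lam, sigma, reverse=True) # approximate inverse
```

Replaying the flow on `Y_new` requires no new gradient computation: each stored step costs one kernel evaluation between the new points and a handful of centers, hence the linear-time claim. The quality of the approximate inverse depends on the step size `lam` relative to the Lipschitz constant of each field.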

Paper Structure

This paper contains 25 sections, 2 theorems, 13 equations, 11 figures, 1 table, 1 algorithm.

Key Result

Proposition 3.1

For each $t \geq 0$, it holds that $\frac{\mathrm{d} L(\tilde{X}(t))}{\mathrm{d} t} = - \| \nabla L(\tilde{X}(t))\|^2 \leq 0.$
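
A short sketch of why this identity holds, under the assumptions stated in the TL;DR: the flow is $\dot{\tilde{X}}(t) = -\tilde{v}_t(\tilde{X}(t))$, the field $\tilde{v}_t$ interpolates the nonzero rows of $\nabla L(\tilde{X}(t))$ on the index set $I$, and $L$ is differentiable along the trajectory. The notation $x_i$ for the $i$-th point of $\tilde{X}(t)$ and $\nabla_i L$ for the corresponding gradient row is introduced here for illustration only.

$$
\frac{\mathrm{d} L(\tilde{X}(t))}{\mathrm{d} t}
= \big\langle \nabla L(\tilde{X}(t)),\, \dot{\tilde{X}}(t) \big\rangle
= -\sum_{i} \big\langle \nabla_i L,\, \tilde{v}_t(x_i) \big\rangle
= -\sum_{i \in I} \| \nabla_i L \|^2
= -\| \nabla L(\tilde{X}(t)) \|^2 .
$$

The third equality uses two facts: the rows $\nabla_i L$ vanish for $i \notin I$, so those terms drop out whatever value $\tilde{v}_t(x_i)$ takes, and $\tilde{v}_t(x_i) = \nabla_i L$ for $i \in I$ by interpolation. Moving the extra points therefore does not change the instantaneous decrease of the loss.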

Figures (11)

  • Figure 1: Illustration of the Vietoris-Rips filtration on a point cloud in $\mathbb{R}^d$, focusing on one-dimensional topological features (loops). As the filtration parameter $t$ increases, loops appear and disappear in the filtration. The values of $t$ at which this happens are recorded in the resulting persistence diagram (right).
  • Figure 2: (blue) A point cloud $X$, and (black) the negative gradient $-\nabla L(X)$ of a simplification loss which aims at destroying the loop by collapsing the circle (reduce the loop's death time) and tearing it (increase the birth time). While $\nabla L(X)$ only affects four points in $X$, the diffeomorphic interpolation $\tilde{v}(X)$ (orange, $\sigma=0.1$) is defined on $\mathbb{R}^d$, hence extends smoothly to other points in $X$.
  • Figure 3: Showcase of the usefulness of subsampling combined with diffeomorphic interpolations to minimize a topological simplification loss, with parameters $\lambda = 0.1$, $s=50$, $n=500$. $(a)$ Initial point cloud $X$ (blue), subsample $X'$ (red), vanilla topological gradient on the subsample (black) and corresponding diffeomorphic interpolation (orange). $(b)$ and $(c)$, the point cloud $X_t$ after running $t=100$ and $t=500$ steps of vanilla gradient descent. $(d)$ the point cloud $X_t$ after running $t=100$ steps of diffeomorphic gradient descent.
  • Figure 4: (Top) From left to right: initial point cloud, and final point cloud for the different flows. (Bottom) Evolution of the loss with respect to the number of iterations and with respect to running time.
  • Figure 5: From left to right: initial Stanford bunny $X_0$, the point cloud after $1{,}000$ epochs of vanilla topological gradient descent (barely any changes), the point cloud after $200$ epochs of diffeomorphic gradient descent, after $1{,}000$ epochs, and finally the evolution of the losses for both methods over iterations.
  • ...and 6 more figures

Theorems & Definitions (4)

  • Proposition 3.1
  • proof
  • Proposition 3.2
  • proof: Proof of Proposition 3.2