The Flood Complex: Large-Scale Persistent Homology on Millions of Points

Florian Graf; Paolo Pellizzoni; Martin Uray; Stefan Huber; Roland Kwitt

TL;DR

The Flood complex addresses the scalability barrier of persistent homology (PH) on large point clouds by combining a small landmark subset with a flooding filtration on the Delaunay triangulation of those landmarks. The construction has several desirable theoretical properties and is amenable to GPU parallelization, which allows PH up to dimension 2 to be computed on several million 3D points. Experiments on synthetic and real-world 3D data indicate that this scale matters for object classification, especially for geometrically or topologically complex objects, where the method outperforms other PH-based methods and point cloud neural networks. This makes PH more practical in machine learning pipelines that need topological features from large Euclidean point clouds.

Abstract

We consider the problem of computing persistent homology (PH) for large-scale Euclidean point cloud data, aimed at downstream machine learning tasks, where the exponential growth of the most widely used Vietoris-Rips complex imposes serious computational limitations. Although more scalable alternatives such as the Alpha complex or sparse Rips approximations exist, they often still result in a prohibitively large number of simplices. This poses challenges in the complex construction and in the subsequent PH computation, prohibiting their use on large-scale point clouds. To mitigate these issues, we introduce the Flood complex, inspired by the advantages of the Alpha and Witness complex constructions. Informally, at a given filtration value $r\geq 0$, the Flood complex contains all simplices from a Delaunay triangulation of a small subset of the point cloud $X$ that are fully covered by balls of radius $r$ emanating from $X$, a process we call flooding. Our construction allows for efficient PH computation, possesses several desirable theoretical properties, and is amenable to GPU parallelization. Scaling experiments on 3D point cloud data show that we can compute PH up to dimension 2 on several million points. Importantly, when evaluating object classification performance on real-world and synthetic data, we provide evidence that this scaling capability is needed, especially if objects are geometrically or topologically complex, yielding performance superior to other PH-based methods and neural networks for point cloud data. Source code and datasets are available at https://github.com/plus-rkwitt/flooder.
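
The informal definition above can be made concrete with a small sketch. Read literally, a simplex enters the filtration at the smallest $r$ for which the balls of radius $r$ around the points of $X$ cover it, that is, at the largest distance from any point of the simplex to its nearest neighbor in $X$. The code below approximates that value by checking a regular barycentric grid on the simplex. The function names (`barycentric_grid`, `flood_value`), the `steps` parameter, and the grid-based approximation are assumptions made for illustration; this is not the API or the algorithm of the flooder package, and the Delaunay triangulation of the landmarks is assumed to be computed elsewhere.

```python
import math
from itertools import product


def barycentric_grid(k, steps):
    """Yield barycentric weight tuples (w_0, ..., w_k) on a regular grid.

    Each weight is a multiple of 1/steps and the weights sum to 1.
    """
    for combo in product(range(steps + 1), repeat=k):
        rest = steps - sum(combo)
        if rest >= 0:
            yield tuple(c / steps for c in combo) + (rest / steps,)


def flood_value(simplex, X, steps=8):
    """Approximate the radius at which balls around X cover the simplex.

    simplex: list of vertex coordinates (hypothetically, a simplex of the
             landmark Delaunay triangulation).
    X:       list of point coordinates of the full point cloud.
    Returns the largest nearest-neighbor distance to X over the grid points,
    which is a lower bound on the exact covering radius.
    """
    dim = len(simplex[0])
    worst = 0.0
    for weights in barycentric_grid(len(simplex) - 1, steps):
        # Convex combination of the simplex vertices.
        p = [sum(w * v[d] for w, v in zip(weights, simplex)) for d in range(dim)]
        nearest = min(math.dist(p, x) for x in X)
        worst = max(worst, nearest)
    return worst


# Toy example: an edge between two landmarks, with one extra data point.
landmarks_edge = [(0.0, 0.0), (1.0, 0.0)]
sparse_cloud = [(0.0, 0.0), (1.0, 0.0)]
dense_cloud = sparse_cloud + [(0.5, 0.0)]

r_sparse = flood_value(landmarks_edge, sparse_cloud)
r_dense = flood_value(landmarks_edge, dense_cloud)
```

In this toy example the extra point on the edge lets the balls cover it at a smaller radius, so `r_dense` is smaller than `r_sparse`. Because the grid points of a face are a subset of the grid points of any simplex containing it (for the same `steps`), the approximated values are monotone under inclusion, which a filtration requires.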
