Topological Point Cloud Clustering

Vincent P. Grande, Michael T. Schaub

TL;DR

Topological Point Cloud Clustering (TPCC) is a new method that clusters the points of an arbitrary point cloud by their contribution to global topological features, using the spectral properties of a simplicial complex associated with the point cloud.

Abstract

We present Topological Point Cloud Clustering (TPCC), a new method to cluster points in an arbitrary point cloud based on their contribution to global topological features. TPCC synthesizes desirable features from spectral clustering and topological data analysis and is based on the spectral properties of a simplicial complex associated with the point cloud. As it relies on sparse eigenvector computations, TPCC is as easy to interpret and implement as spectral clustering. However, by focusing not just on a single matrix associated with a graph created from the point cloud data, but on a whole set of Hodge Laplacians associated with an appropriately constructed simplicial complex, we can leverage a far richer set of topological features to characterize the data points within the point cloud and benefit from the relative robustness of topological techniques against noise. We test the performance of TPCC on both synthetic and real-world data and compare it with classical spectral clustering.

Paper Structure

This paper contains 38 sections, 4 theorems, 7 equations, 10 figures, 2 tables, 1 algorithm.

Key Result

Lemma 2.4

For a simplicial complex $\mathcal{S}$ with boundary matrices $\mathcal{B}_i$ we have that ${\mathcal{B}_n\circ\mathcal{B}_{n+1}=0}$ for $n\ge 0$.
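The lemma can be checked numerically on a small example. The sketch below is not taken from the paper: it assumes the standard signed boundary convention (dropping the $i$-th vertex of an ordered simplex contributes the sign $(-1)^i$), and the helper `boundary_matrix` and the toy complex of two triangles glued along an edge are hypothetical names and data chosen for illustration.

```python
import numpy as np


def boundary_matrix(k_simplices, faces):
    """Signed boundary matrix: rows index (k-1)-simplices, columns k-simplices.

    Hypothetical helper for illustration. Simplices are sorted vertex tuples.
    """
    row_of = {face: i for i, face in enumerate(faces)}
    B = np.zeros((len(faces), len(k_simplices)), dtype=int)
    for col, simplex in enumerate(k_simplices):
        for drop in range(len(simplex)):
            # Remove the vertex at position `drop`; its sign is (-1)**drop.
            face = simplex[:drop] + simplex[drop + 1:]
            B[row_of[face], col] = (-1) ** drop
    return B


# Toy complex (an assumption): two filled triangles sharing the edge (1, 2).
vertices = [(0,), (1,), (2,), (3,)]
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
triangles = [(0, 1, 2), (1, 2, 3)]

B1 = boundary_matrix(edges, vertices)    # edges -> vertices
B2 = boundary_matrix(triangles, edges)   # triangles -> edges

product = B1 @ B2                        # the case n = 1 of the lemma
print(product)
```

Every entry of `product` is zero because each vertex of a triangle appears in exactly two of its edges, with opposite signs.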

Figures (10)

  • Figure 1: Above we depict the heatmaps for all $16$ distinct combinations of topological features encoded in the topological signature across $3$ dimensions of our toy example. Note that some of the features are redundant, as both edges and faces can measure membership of a torus.
  • Figure 2: The final clustering obtained with TPCC. There are $10$ clusters in total: the two tori (turquoise and ochre), the two disconnected cubes (red and lime), the lines connecting the tori to the middle (dark blue and salmon), the middle line (azure), the intersection of the lines (yellow), and the gluing points of the lines to the tori (fuchsia and brown). Note that there are virtually no outliers.
  • Figure 3: The circle is divided into two parts by a vertical line. This gives the corresponding SC two generating loops in dimension $1$, corresponding to a $2$-dimensional $0$-eigenspace of the Hodge Laplacian $L_1$ and a $2$-dimensional 1st feature space $\mathcal{X}_1$. However, now there are three linear subspaces corresponding to linear combinations of the two generating loops. TPCC is able to detect three different clusters of topologically significant edges.
  • Figure 4: TPCC is the only approach correctly distinguishing the spheres and circles.
  • Figure 5: Left: Energy landscape of cyclo-octane clustered by topological point cloud clustering. We have four different clusters, with the green one being the anomalous points. Right: Clustering of the Henneberg surface.
  • ...and 5 more figures
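The statement in the Figure 3 caption, that two generating loops correspond to a $2$-dimensional $0$-eigenspace of $L_1$, can be illustrated with a small computation. This is a minimal sketch under stated assumptions, not the TPCC implementation: it uses the standard definition $L_1 = \mathcal{B}_1^\top \mathcal{B}_1 + \mathcal{B}_2 \mathcal{B}_2^\top$, and the complex (a hexagon with one chord, no filled triangles) is a hypothetical stand-in for the divided circle.

```python
import numpy as np

# Hypothetical stand-in for the circle split by a line:
# a 6-cycle 0-1-2-3-4-5-0 plus the chord (0, 3). No 2-simplices.
n_vertices = 6
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5), (0, 3)]

# Boundary matrix B1: each edge (i, j) with i < j gets -1 at i and +1 at j.
B1 = np.zeros((n_vertices, len(edges)))
for col, (i, j) in enumerate(edges):
    B1[i, col] = -1.0
    B1[j, col] = 1.0

# With no triangles, B2 has zero columns, so B2 @ B2.T contributes nothing.
B2 = np.zeros((len(edges), 0))
L1 = B1.T @ B1 + B2 @ B2.T

# L1 is symmetric, so eigh applies; eigenvalues are returned in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(L1)
zero_dim = int(np.sum(np.abs(eigenvalues) < 1e-9))
print("dimension of the 0-eigenspace of L1:", zero_dim)

# Columns spanning the 0-eigenspace (harmonic edge flows supported on the loops).
harmonic_basis = eigenvectors[:, :zero_dim]
```

The graph is connected with 6 vertices and 7 edges, so it has $7 - 6 + 1 = 2$ independent loops, matching the dimension of the kernel.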

Theorems & Definitions (8)

  • Definition 2.1: Abstract simplicial complex
  • Definition 2.2
  • Definition 2.3
  • Lemma 2.4
  • Lemma 2.5: Eckmann 1944; Friedman 1998
  • Definition 3.1: Topological Signature
  • Theorem 4.1
  • Theorem