Mixup Barcodes: Quantifying Geometric-Topological Interactions between Point Clouds
Hubert Wagner, Nickolas Arustamyan, Matthew Wheeler, Peter Bubenik
TL;DR
The paper introduces mixup barcodes, a topological descriptor that combines standard persistence with image persistence to quantify interactions between two point clouds. From two filtrations, L and K, the authors compute a mixup barcode and provide an algorithm and software that reduce it to two summary statistics: total mixup and total percentage mixup. Applied to embeddings produced during neural network training, the method is used to study how well different classes are disentangled, and the results suggest it captures inter-class structure that conventional persistence misses, because it is sensitive to where topological features are located. The work extends topological data analysis toward interaction-aware descriptors and may apply in science and engineering domains where spatial relations between components matter. It also aligns with the Chromatic TDA direction and shows potential for diagnosing training dynamics in high-dimensional settings.
Abstract
We combine standard persistent homology with image persistent homology to define a novel way of characterizing shapes and interactions between them. In particular, we introduce: (1) a mixup barcode, which captures geometric-topological interactions (mixup) between two point sets in arbitrary dimension; (2) simple summary statistics, total mixup and total percentage mixup, which quantify the complexity of the interactions as a single number; (3) a software tool for playing with the above. As a proof of concept, we apply this tool to a problem arising from machine learning. In particular, we study the disentanglement in embeddings of different classes. The results suggest that topological mixup is a useful method for characterizing interactions for low and high-dimensional data. Compared to the typical usage of persistent homology, the new tool is sensitive to the geometric locations of the topological features, which is often desirable.
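To make the two summary statistics concrete, here is a minimal Python sketch. It rests on assumptions that the abstract does not state: that each bar of a mixup barcode can be stored as a triple (birth, image death, standard death) with image death no later than standard death, that total mixup is the sum of the gaps between the two deaths, and that total percentage mixup divides that sum by the total length of the standard bars. The function name `mixup_summary` is hypothetical and is not part of the authors' software.

```python
from typing import Iterable, Tuple

# Assumed representation of one mixup bar (not confirmed by the abstract):
# (birth, death in image persistence, death in standard persistence).
MixupBar = Tuple[float, float, float]


def mixup_summary(bars: Iterable[MixupBar]) -> Tuple[float, float]:
    """Return (total mixup, total percentage mixup) under the assumed definitions."""
    total_mixup = 0.0
    total_length = 0.0
    for birth, death_image, death_standard in bars:
        if not (birth <= death_image <= death_standard):
            raise ValueError("expected birth <= image death <= standard death")
        # Assumed: a bar's mixup is how much earlier it dies in image persistence.
        total_mixup += death_standard - death_image
        total_length += death_standard - birth
    # With no bars, or only zero-length bars, report zero percentage mixup.
    percentage = 100.0 * total_mixup / total_length if total_length > 0 else 0.0
    return total_mixup, percentage


if __name__ == "__main__":
    # Two illustrative bars: the first is cut short by the other point set,
    # the second is unaffected.
    example = [(0.0, 1.0, 2.0), (0.5, 1.5, 1.5)]
    print(mixup_summary(example))
```

Under these assumptions the percentage is a ratio of lengths, so rescaling both point sets by the same factor would leave it unchanged, while total mixup would scale with the data.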
