Adversarially Robust Topological Inference
Siddharth Vishwanath, Bharath K. Sriperumbudur, Kenji Fukumizu, Satoshi Kuriki
TL;DR
This work tackles the vulnerability of persistent homology to outliers by proposing a robust, scalable framework based on a median-of-means distance function (MoM Dist). It introduces weighted filtrations built on MoM Dist, proves consistency and near-minimax rates for sublevel persistence diagrams under adversarial contamination, and establishes stability for the corresponding filtrations. An adaptive Lepski-based procedure selects the tuning parameter Q (the number of blocks) without sacrificing these guarantees, and an influence analysis shows that MoM-based methods reduce the impact of outliers. Experiments on synthetic and real data demonstrate robust signal recovery and improved performance over existing methods, highlighting practical applicability to high-dimensional topological inference.
Abstract
The distance function to a compact set plays a crucial role in the paradigm of topological data analysis. In particular, the sublevel sets of the distance function are used in the computation of persistent homology -- a backbone of the topological data analysis pipeline. Despite its stability to perturbations in the Hausdorff distance, persistent homology is highly sensitive to outliers. In this work, we develop a framework of statistical inference for persistent homology in the presence of outliers. Drawing inspiration from recent developments in robust statistics, we propose a \textit{median-of-means} variant of the distance function (\textsf{MoM Dist}) and establish its statistical properties. In particular, we show that, even in the presence of outliers, the sublevel filtrations and weighted filtrations induced by \textsf{MoM Dist} are both consistent estimators of the true underlying population counterpart and exhibit near minimax-optimal performance in adversarial settings. Finally, we demonstrate the advantages of the proposed methodology through simulations and applications.
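To make the median-of-means idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the sample is split into Q disjoint blocks, that the distance from a query point to each block is the minimum Euclidean distance to the block's points, and that the robust distance is the median of these Q block distances. The names `dist_to_set` and `mom_dist`, the random partitioning scheme, and the toy data are illustrative assumptions.

```python
import math
import random
import statistics


def dist_to_set(x, points):
    """Distance from x to a finite point set: the minimum Euclidean distance."""
    return min(math.dist(x, p) for p in points)


def mom_dist(x, sample, Q, seed=0):
    """Hypothetical MoM-style distance: the median, over Q disjoint blocks,
    of the distance from x to each block."""
    if not 1 <= Q <= len(sample):
        raise ValueError("Q must be between 1 and the sample size")
    rng = random.Random(seed)
    shuffled = list(sample)
    rng.shuffle(shuffled)  # random partition of the sample (an assumption)
    blocks = [shuffled[q::Q] for q in range(Q)]  # Q non-empty disjoint blocks
    return statistics.median(dist_to_set(x, block) for block in blocks)


# Toy data: 60 points on the unit circle plus 3 outliers near the origin.
signal = [
    (math.cos(2 * math.pi * i / 60), math.sin(2 * math.pi * i / 60))
    for i in range(60)
]
outliers = [(0.0, 0.0), (0.01, 0.0), (0.0, 0.01)]
sample = signal + outliers

origin = (0.0, 0.0)
plain = dist_to_set(origin, sample)     # pulled to zero by a single outlier
robust = mom_dist(origin, sample, Q=9)  # at most 3 of the 9 blocks are contaminated
print(plain, robust)
```

In this toy setting the ordinary distance function at the origin collapses to zero because one outlier sits there, which would fill in the hole of the circle in a sublevel filtration. With Q = 9 blocks and three outliers, at least six blocks are outlier-free, so the median block distance stays close to the radius of the circle. The sketch relies on more than half of the blocks being clean; how Q should be chosen in practice is the subject of the paper itself.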
