Geometric Localization of Homology Cycles
Amritendu Dhar, Vijay Natarajan, Abhishek Rathod
TL;DR
This work tackles the NP-hard problem of localizing homology classes by introducing a geometry-aware $\ell_2$-radius objective, which measures the radius of the smallest ball enclosing a cycle. The authors develop polynomial-time approximation algorithms for three tasks: localizing a cycle within a given homology class, computing a minimum homology basis, and constructing a minimum persistent homology basis. They also establish a notion of approximate stability under persistence. A central result is that an optimal persistent basis can be obtained by selecting a minimum-radius cycle for each bar, with practical algorithms that run in $O(|P|N^3\log N)$ time. Experiments on diverse datasets show that the computed cycles are tight and of high quality, often outperforming state-of-the-art methods such as PersLoop, and demonstrate the practical viability of geometry-aware homology localization in TDA.
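To make the $\ell_2$-radius objective concrete, the sketch below estimates the radius of the smallest ball enclosing the vertices of a cycle. It is an illustration only, not the authors' algorithm: the summary does not say how the radius is computed or whether the center is constrained, so the code substitutes the standard Badoiu-Clarkson iteration for approximate minimum enclosing balls, with an unconstrained center. The function name `enclosing_radius` and the `iterations` parameter are hypothetical.

```python
import math


def enclosing_radius(points, iterations=1000):
    """Approximate the radius of the smallest ball enclosing `points`.

    Assumption: uses the Badoiu-Clarkson iteration, a generic technique
    swapped in for illustration; it is not taken from the paper.
    """
    # Start from an arbitrary input point as the candidate center.
    center = list(points[0])
    for i in range(1, iterations + 1):
        # Find the point currently farthest from the candidate center.
        far = max(points, key=lambda p: math.dist(p, center))
        # Move the center toward it with a shrinking step 1 / (i + 1).
        step = 1.0 / (i + 1)
        center = [c + step * (f - c) for c, f in zip(center, far)]
    # The ball around the final center with this radius contains every
    # point, so the returned value never underestimates the true radius.
    return max(math.dist(p, center) for p in points)


# Hypothetical cycle: the four vertices of a unit square in the plane.
square_cycle = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
radius = enclosing_radius(square_cycle)
```

For the unit square the exact value is $\sqrt{2}/2$, and the estimate approaches it from above as `iterations` grows. Under this objective, a cycle with a smaller enclosing radius is the preferred representative of its class.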
Abstract
Computing an optimal cycle in a given homology class, also referred to as the homology localization problem, is known to be NP-hard in general. Furthermore, there is currently no known optimality criterion that localizes classes geometrically and admits a stability property in the setting of persistent homology. We present a geometric optimization of cycles that is computable in polynomial time and is stable in an approximate sense. Tailoring our search criterion to different settings, we obtain various optimization problems, such as optimal homologous cycle, minimum homology basis, and minimum persistent homology basis. In practice, the (trivial) exact algorithm is computationally expensive despite having a worst-case polynomial runtime. Therefore, we design approximation algorithms for the above problems and study their performance experimentally. These algorithms have reasonable runtimes for moderate-sized datasets, and the cycles they compute are consistently of high quality, as demonstrated by experiments on multiple datasets.
