Fast Witness Persistence for MRI Volumes via Hybrid Landmarking
Jorge Leonardo Ruiz Williams
TL;DR
This work tackles the scalability bottleneck of exact filtrations in persistent homology for full MRI volumes by combining a hybrid landmarking strategy with a GPU-accelerated lazy witness filtration. The method converts MRI data into a unit-cube point cloud with intensity weights, selects a compact landmark set using a density-aware MaxMin approach, and builds a lazy witness complex under a $k$-nearest-witness constraint. This yields efficient topological summaries while preserving $H_1$ and, where relevant, higher-dimensional features. Key contributions are: the hybrid landmarking scheme with an automatic scaling rule; a coverage-aware diagnostic suite; a modular Python distribution (whale-tda) with CLI presets for fast sweeps; and a benchmark suite across BrainWeb, IXI, and synthetic datasets demonstrating improved coverage and scalability. By avoiding the combinatorial blow-up of classical filtrations while maintaining fidelity to salient topological signatures, the approach supports rapid topology-aware analysis in medical imaging and beyond, as evidenced by sub-second to tens-of-seconds runtimes and robust agreement with reference filtrations.
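The first pipeline step, mapping a volume to a unit-cube point cloud with intensity weights, can be sketched as follows. This is a minimal illustration under stated assumptions, not the whale-tda implementation: the function name `volume_to_point_cloud`, the `threshold` parameter, and the max-normalisation of weights are all hypothetical choices made for the example.

```python
import numpy as np

def volume_to_point_cloud(volume, threshold=0.0):
    """Map voxels brighter than `threshold` to points in [0, 1]^3.

    Hypothetical helper for illustration; returns (points, weights).
    """
    volume = np.asarray(volume, dtype=float)
    mask = volume > threshold
    idx = np.argwhere(mask)  # (n, 3) integer voxel indices, row-major order

    # Divide each axis by (size - 1) so indices land in [0, 1];
    # axes of length 1 are guarded against division by zero.
    scale = np.maximum(np.array(volume.shape) - 1, 1)
    points = idx / scale

    # Intensities of the kept voxels, in the same row-major order as `idx`.
    weights = volume[mask]
    if weights.size and weights.max() > 0:
        weights = weights / weights.max()  # assumed normalisation to (0, 1]
    return points, weights

# Toy 3D volume with two bright voxels at opposite corners.
vol = np.zeros((4, 5, 6))
vol[0, 0, 0] = 2.0
vol[3, 4, 5] = 4.0
points, weights = volume_to_point_cloud(vol)
```

Scaling each axis independently stretches anisotropic volumes to fill the cube; a pipeline that needs to preserve physical voxel spacing would divide by a single common factor instead.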
Abstract
We introduce a scalable witness-based persistent homology pipeline for full-brain MRI volumes that couples density-aware landmark selection with a GPU-ready witness filtration. Candidates are scored by a hybrid metric that balances geometric coverage against inverse kernel density, yielding landmark sets that shrink mean pairwise distances by 30–60% relative to random or density-only baselines while preserving topological features. Benchmarks on BrainWeb, IXI, and synthetic manifolds execute in under ten seconds on a single NVIDIA RTX 4090 GPU, avoiding the combinatorial blow-up of Čech, Vietoris-Rips, and alpha filtrations. The package is distributed on PyPI as whale-tda (installable via pip); source and issues are hosted at https://github.com/jorgeLRW/whale. The release also exposes a fast preset (mri_deep_dive_fast) for exploratory sweeps and ships with reproducibility-focused scripts and artifacts for drop-in use in medical imaging workflows.
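One way to read the hybrid scoring described above is as a greedy MaxMin loop whose score blends a coverage term with an inverse-density term. The sketch below is an assumption-laden illustration, not the package's code: the function name `hybrid_maxmin_landmarks`, the convex weight `alpha`, the Gaussian kernel with fixed `bandwidth`, and the normalisation of both terms are hypothetical, and the dense $O(n^2)$ distance matrix is only suitable for small inputs.

```python
import numpy as np

def hybrid_maxmin_landmarks(points, m, alpha=0.7, bandwidth=0.1, seed=0):
    """Greedily pick m landmark indices with a hybrid coverage/density score.

    Hypothetical scoring: alpha * (normalised distance to nearest landmark)
    + (1 - alpha) * (normalised inverse Gaussian kernel density).
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= number of points")

    # Dense squared-distance matrix; acceptable for a sketch only.
    sq = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)

    # Gaussian kernel density at each point (strictly positive: self term).
    density = np.exp(-sq / (2.0 * bandwidth ** 2)).mean(axis=1)
    inv_density = 1.0 / density
    inv_density /= inv_density.max()

    rng = np.random.default_rng(seed)
    first = int(rng.integers(n))
    selected = [first]
    min_dist = np.sqrt(sq[first])  # distance to the nearest chosen landmark

    while len(selected) < m:
        top = min_dist.max()
        coverage = min_dist / top if top > 0 else min_dist
        score = alpha * coverage + (1.0 - alpha) * inv_density
        score[selected] = -np.inf  # never re-pick a landmark
        nxt = int(np.argmax(score))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.sqrt(sq[nxt]))
    return np.array(selected)

# Usage on a small random cloud in the unit cube.
cloud = np.random.default_rng(1).random((200, 3))
landmarks = hybrid_maxmin_landmarks(cloud, m=20)
```

Setting `alpha=1.0` recovers plain MaxMin (farthest-point sampling), while `alpha=0.0` ranks candidates by inverse density alone, so the two baselines sit at the ends of the same parameter.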
