Integrating Multi-scale and Multi-filtration Topological Features for Medical Image Classification
Pengfei Gu, Huimin Li, Haoteng Tang, Dongkuan Xu, Erik Enriquez, DongChul Kim, Bin Fu, Danny Z. Chen
TL;DR
The paper addresses the underutilization of topological information in medical image classification by introducing a topology-guided framework that extracts stable multi-scale, multi-filtration persistence diagrams (PDs). It consolidates the per-scale diagrams with a vineyard algorithm, encodes them with a cross-attention-based PD encoder, and fuses the resulting topology embeddings into CNN/Transformer backbones in an end-to-end pipeline. Across ISIC 2018, Kvasir, and CBIS-DDSM, the approach yields consistent improvements over strong baselines and state-of-the-art methods, with ablations validating the contributions of scale stabilization and multi-filtration fusion. This topology-centric augmentation enhances robustness and interpretability in medical image classification while remaining model-agnostic, so it can be integrated with various vision architectures.
Abstract
Modern deep neural networks have shown remarkable performance in medical image classification. However, such networks either emphasize pixel-intensity features instead of fundamental anatomical structures (e.g., those encoded by topological invariants), or they capture only simple topological features via single-parameter persistence. In this paper, we propose a new topology-guided classification framework that extracts multi-scale and multi-filtration persistent topological features and integrates them into vision classification backbones. For an input image, we first compute cubical persistence diagrams (PDs) across multiple image resolutions/scales. We then develop a "vineyard" algorithm that consolidates these PDs into a single, stable diagram capturing signatures at varying granularities, from global anatomy to subtle local irregularities that may indicate early-stage disease. To further exploit richer topological representations produced by multiple filtrations, we design a cross-attention-based neural network that directly processes the consolidated final PDs. The resulting topological embeddings are fused with feature maps from CNNs or Transformers. By integrating multi-scale and multi-filtration topologies into an end-to-end architecture, our approach enhances the model's capacity to recognize complex anatomical structures. Evaluations on three public datasets show consistent, considerable improvements over strong baselines and state-of-the-art methods, demonstrating the value of our comprehensive topological perspective for robust and interpretable medical image classification.
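To make the first step of the pipeline concrete, the following is a minimal sketch of computing one persistence diagram per image scale. It is an illustration under stated assumptions, not the authors' implementation: it covers only 0-dimensional sublevel-set persistence, treats pixels as vertices with 4-connectivity, and uses average pooling as the downsampling scheme. The function names (`downsample`, `h0_sublevel_persistence`, `multiscale_diagrams`) are hypothetical, and the vineyard consolidation and cross-attention encoder are not shown.

```python
from typing import Dict, List, Sequence, Tuple

Image = List[List[float]]
Diagram = List[Tuple[float, float]]


def downsample(img: Image, factor: int) -> Image:
    """Average-pool by an integer factor (assumed scheme; edge remainder is dropped)."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [
        [
            sum(img[r * factor + i][c * factor + j]
                for i in range(factor) for j in range(factor)) / factor ** 2
            for c in range(w)
        ]
        for r in range(h)
    ]


def h0_sublevel_persistence(img: Image) -> Diagram:
    """0-dimensional sublevel-set persistence, pixels as vertices, 4-connectivity."""
    h, w = len(img), len(img[0])
    # Process pixels in order of increasing intensity (ties broken by index).
    order = sorted(range(h * w), key=lambda k: (img[k // w][k % w], k))
    parent: Dict[int, int] = {}   # union-find forest over pixels added so far
    birth: Dict[int, float] = {}  # birth value stored at each component root

    def find(k: int) -> int:
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    diagram: Diagram = []
    for k in order:
        r, c = divmod(k, w)
        value = img[r][c]
        parent[k], birth[k] = k, value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            n = rr * w + cc
            if not (0 <= rr < h and 0 <= cc < w) or n not in parent:
                continue
            a, b = find(k), find(n)
            if a == b:
                continue
            # Elder rule: the component with the later birth dies at this value.
            if birth[a] < birth[b]:
                a, b = b, a
            if birth[a] < value:  # skip zero-persistence pairs
                diagram.append((birth[a], value))
            parent[a] = b
    # The component containing the global minimum never dies.
    diagram.append((birth[find(order[0])], float("inf")))
    return sorted(diagram)


def multiscale_diagrams(img: Image, factors: Sequence[int] = (1, 2, 4)) -> Dict[int, Diagram]:
    """One H0 diagram per downsampling factor (factor 1 is the original image)."""
    return {f: h0_sublevel_persistence(downsample(img, f)) for f in factors}
```

For instance, on the 3x3 image `[[0, 5, 1], [5, 5, 5], [2, 5, 3]]`, the three corners with intensities 1, 2, and 3 each start their own component and all merge into the component born at 0 once the intensity-5 pixels appear, giving three finite pairs plus one essential class. A practical pipeline would more likely rely on a dedicated library for cubical persistence in all dimensions; this sketch only shows the per-scale structure that a consolidation step would then operate on.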
