Brain Tumor Classification from 3D MRI Using Persistent Homology and Betti Features: A Topological Data Analysis Approach on BraTS2020

Faisal Ahmed

Abstract

Accurate and interpretable brain tumor classification from medical imaging remains a challenging problem due to the high dimensionality and complex structural patterns present in magnetic resonance imaging (MRI). In this study, we propose a topology-driven framework for brain tumor classification based on Topological Data Analysis (TDA) applied directly to three-dimensional (3D) MRI volumes. Specifically, we analyze 3D Fluid-Attenuated Inversion Recovery (FLAIR) images from the BraTS 2020 dataset and extract interpretable topological descriptors using persistent homology. Persistent homology captures intrinsic geometric and structural characteristics of the data through Betti numbers, which count connected components (Betti-0), loops (Betti-1), and voids (Betti-2). From the 3D MRI volumes, we derive a compact set of 100 topological features that summarize the underlying topology of brain tumor structures. These descriptors represent complex 3D tumor morphology while significantly reducing data dimensionality. Unlike many deep learning approaches that require large-scale training data or complex architectures, the proposed framework relies on computationally efficient topological features extracted directly from the images. These features are used to train classical machine learning classifiers, including Random Forest and XGBoost, for binary classification of high-grade glioma (HGG) and low-grade glioma (LGG). Experimental results on the BraTS 2020 dataset show that the Random Forest classifier combined with selected Betti features achieves an accuracy of 89.19%. These findings highlight the potential of persistent homology as an effective and interpretable approach for analyzing complex 3D medical images and performing brain tumor classification.
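To make the notion of a Betti-0 count over a grayscale filtration concrete, the following is a minimal sketch, not the authors' implementation. It assumes a sublevel filtration (voxels with intensity at most t are active) and 26-connectivity between voxels. The function names `count_components` and `betti0_curve` and the toy volume are hypothetical. A real pipeline would more likely use a dedicated cubical persistent homology library, which also yields Betti-1 and Betti-2.

```python
from collections import deque
from itertools import product

# Assumption: voxels sharing a face, edge, or corner are connected (26-neighbourhood).
NEIGHBOURS = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]


def count_components(mask):
    """Count connected components of True voxels in a nested-list 3D mask."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    seen = set()
    count = 0
    for start in product(range(nz), range(ny), range(nx)):
        z, y, x = start
        if not mask[z][y][x] or start in seen:
            continue
        # New component found: flood-fill it with a breadth-first search.
        count += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            cz, cy, cx = queue.popleft()
            for dz, dy, dx in NEIGHBOURS:
                n = (cz + dz, cy + dy, cx + dx)
                inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                if inside and n not in seen and mask[n[0]][n[1]][n[2]]:
                    seen.add(n)
                    queue.append(n)
    return count


def betti0_curve(volume, thresholds):
    """Betti-0 of the sublevel set {voxel <= t} for each threshold t."""
    curve = []
    for t in thresholds:
        mask = [[[v <= t for v in row] for row in plane] for plane in volume]
        curve.append(count_components(mask))
    return curve


# Toy 1 x 1 x 5 "volume": three dark voxels separated by bright ones.
toy_volume = [[[10, 200, 20, 200, 30]]]
toy_curve = betti0_curve(toy_volume, [5, 10, 20, 30, 200])
print(toy_curve)  # [0, 1, 2, 3, 1]
```

In the toy example, components appear one by one as the threshold passes 10, 20, and 30, and merge into a single component once the bright voxels are included at 200. This birth-and-merge behaviour is what persistent homology tracks.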

Paper Structure

This paper contains 18 sections, 24 equations, 7 figures, 3 tables, and 1 algorithm.

Figures (7)

  • Figure 1: Visualization of topological feature distributions using Principal Component Analysis (PCA). The plots show the first two principal components of Betti-based feature vectors derived from BraTS2020 MRI volumes: (a) Betti-0 features representing connected components, (b) Betti-1 features corresponding to loops or tunnels, and (c) Betti-2 features representing cavities. The separation between low-grade glioma (LGG) and high-grade glioma (HGG) samples demonstrates the discriminative capability of persistent homology-based topological descriptors for brain tumor classification.
  • Figure 2: Representative 2D slices from 3D FLAIR MRI volumes in the BraTS2020 dataset. The first row shows examples of low-grade glioma (LGG) cases, while the second row shows examples of high-grade glioma (HGG) cases. These images illustrate the structural differences in tumor appearance between LGG and HGG samples used for topological feature extraction.
  • Figure 3: Visualization of Betti functions extracted from 3D MRI volumes in the BraTS2020 dataset. Each plot shows the median Betti curve along with $40\%$ confidence bands for the two tumor classes: low-grade glioma (LGG) and high-grade glioma (HGG). The $x$-axis represents normalized grayscale filtration thresholds, while the $y$-axis represents the number of topological features at each threshold: connected components for Betti-0, loops for Betti-1, and cavities for Betti-2. These curves illustrate structural differences between LGG and HGG tumors captured by persistent homology.
  • Figure 4: Cubical sublevel filtration of an MRI slice. A grayscale 3D MRI is progressively thresholded to generate a sequence of binary images. The images $\mathcal{X}_{60}$, $\mathcal{X}_{80}$, and $\mathcal{X}_{120}$ correspond to threshold values of $60$, $80$, and $120$, respectively. As the threshold increases, additional voxels become activated, producing a nested sequence of cubical complexes. During this filtration process, topological structures such as connected components ($k=0$), loops or tunnels ($k=1$), and cavities ($k=2$) appear and evolve, which are subsequently captured by persistent homology.
  • Figure 5: Overview of the proposed TDA-based classification framework. The pipeline begins with a 3D MRI volume composed of multiple 2D slices. Persistent homology is then applied using cubical filtration to compute persistence diagrams for homology dimensions $k=0,1,$ and $2$, capturing connected components, loops, and cavities within the image structure. These diagrams are subsequently transformed into Betti curves, which describe the evolution of topological features across the filtration scale. The resulting Betti-0, Betti-1, and Betti-2 curves are converted into fixed-length feature vectors and concatenated to form a unified topological representation for each MRI volume. Feature selection is then performed to identify the most informative descriptors, and the selected features are finally used as input to machine learning classifiers to distinguish between low-grade glioma (LGG) and high-grade glioma (HGG) in the BraTS2020 dataset.
  • ...and 2 more figures
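
The Figure 5 caption describes turning persistence diagrams into Betti curves and concatenating them into one feature vector. A minimal sketch of that step follows, under stated assumptions: each diagram is a list of (birth, death) pairs, a feature counts as alive at threshold t when birth <= t < death, and every curve is sampled on the same threshold grid. The names `betti_curve` and `betti_feature_vector`, the grid, and the example diagrams are hypothetical and not taken from the paper.

```python
def betti_curve(intervals, thresholds):
    """Number of (birth, death) intervals alive at each threshold t."""
    return [sum(1 for birth, death in intervals if birth <= t < death)
            for t in thresholds]


def betti_feature_vector(diagrams, thresholds):
    """Concatenate the sampled Betti curves for each homology dimension k."""
    features = []
    for k in sorted(diagrams):  # k = 0, 1, 2 in order
        features.extend(betti_curve(diagrams[k], thresholds))
    return features


# Illustrative diagrams on a normalized filtration scale in [0, 1].
example_diagrams = {
    0: [(0.0, float("inf")), (0.2, 0.5)],  # one essential component, one that merges
    1: [(0.3, 0.9)],                       # a single loop
    2: [],                                 # no cavities
}
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
vector = betti_feature_vector(example_diagrams, grid)
print(vector)  # [1, 2, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
```

The vector length equals the number of homology dimensions times the grid size, so a finer grid gives a longer descriptor. The feature selection step mentioned in the caption would then operate on these entries.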