Manifold Topological Deep Learning for Biomedical Data

Xiang Liu, Zhe Su, Yongyi Shi, Yiying Tong, Ge Wang, Guo-Wei Wei

TL;DR

MTDL tackles the challenge of applying topological deep learning to differentiable manifolds by representing images as discrete manifolds with vector fields and performing a topology-preserving Hodge decomposition. The key idea is to decompose a vector field on a manifold into curl-free, divergence-free, and harmonic components, concatenate these components into a multi-channel input, and process it with a Transformer-augmented CNN. On the MedMNIST v2 biomedical image benchmark, MTDL achieves superior AUC and ACC across 2D and 3D datasets while using a lightweight parameter count, and demonstrates robustness across modalities, scales, and task types. The work highlights the practical potential of integrating differential topology with deep learning for medical image analysis and points to future extensions such as richer decompositions and attention-based long-range inference.

Abstract

Recently, topological deep learning (TDL), which integrates algebraic topology with deep neural networks, has achieved tremendous success in processing point-cloud data, emerging as a promising paradigm in data science. However, TDL has not been developed for data on differentiable manifolds, including images, due to the challenges posed by differential topology. We address this challenge by introducing manifold topological deep learning (MTDL) for the first time. To highlight the power of Hodge theory rooted in differential topology, we consider a simple convolutional neural network (CNN) in MTDL. In this novel framework, original images are represented as smooth manifolds with vector fields that are decomposed into three orthogonal components based on Hodge theory. These components are then concatenated to form an input image for the CNN architecture. The performance of MTDL is evaluated using the MedMNIST v2 benchmark database, which comprises 717,287 biomedical images from eleven 2D and six 3D datasets. MTDL significantly outperforms other competing methods, extending TDL to a wide range of data on smooth manifolds.
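
For reference, the three-way split mentioned in the abstract can be written in the standard form of the Hodge decomposition for a 1-form $\omega$ on a compact manifold. The symbols $\alpha$, $\beta$, and $h$ are notation chosen here for illustration and are not taken from the paper. On a manifold with boundary, suitable boundary conditions (for example normal or tangential ones) are also needed for the splitting to be orthogonal and unique.

$$
\omega = d\alpha + \delta\beta + h, \qquad dh = 0, \quad \delta h = 0
$$

Here $d\alpha$ is the curl-free part because $d(d\alpha) = 0$, $\delta\beta$ is the divergence-free part because $\delta(\delta\beta) = 0$, and $h$ is the harmonic part, which is both curl-free and divergence-free.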

Paper Structure

This paper contains 50 sections, 45 equations, 7 figures, 9 tables.

Figures (7)

  • Figure 1: Model architecture of MTDL. The original image is first modeled as a discrete manifold on Cartesian grids under specific boundary conditions (a). A vector field is then constructed on the manifold (b). Using the discrete Hodge Laplacian for manifolds with boundary (c), this vector field is decomposed into three orthogonal components: curl-free, divergence-free, and harmonic parts (d). These components are subsequently concatenated to form a multi-channel image, which serves as the input of CNN for the classification task (e).
  • Figure 2: Performance comparison between MTDL model and other models on the MedMNIST v2 dataset. a: Comparison of model performance in terms of AUC and ACC across all 17 datasets of MedMNIST v2. The polygon representing the MTDL model covers the largest area, indicating its superior performance compared to the other models. b: Average performance of all models over 2D and 3D tasks. MTDL consistently achieves higher AUC and ACC values, outperforming all other models for both types of tasks. c: Frequency of top-ranking performance across 2D and 3D tasks. MTDL significantly surpasses all other models, demonstrating its consistent superiority in both 2D and 3D tasks.
  • Figure 3: Performance comparison between MTDL and other models on different groups based on data modality, data scale, and task type. Only the six best models are shown for each group. a: Comparison on four data modality groups (Radiology, Microscopy, Ophthalmology, Dermatology). b: Comparison on four data scale groups ($n<10$K, $10{\rm K}\leqslant n<50{\rm K}$, $50{\rm K}\leqslant n<100{\rm K}$, $100{\rm K}<n$), where $n$ is the number of samples in each dataset. c: Comparison on four task type groups ($n=2$, $2<n\leqslant5$, $5<n\leqslant10$, $10<n$), where $n$ is the number of classes in each dataset.
  • Figure 4: Illustration of the topology-preserving property of the discrete Hodge Laplacian and the Hodge decomposition for a medical image. In (a), the foreground of the image is represented as a manifold with boundary. The Laplacian $L_{1,n}$ is computed and its eigenvectors corresponding to the zero eigenvalues are displayed, accurately capturing the three loops in the manifold (b). In (c), a vector field (1-form) is constructed from the image and decomposed into three orthogonal components: the curl-free, divergence-free, and harmonic parts. The harmonic component encapsulates the global topological information, while the other two components convey distinct aspects of local information.
  • Figure 5: Illustration of the 3D Hodge decomposition on a pear with a tunnel model. From left to right: the original vector field, the curl-free field, the divergence-free field, the normal harmonic field, the tangential harmonic field, and the curl gradient field.
  • ...and 2 more figures
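
The captions for Figures 1 and 4 describe splitting a vector field into curl-free, divergence-free, and harmonic parts, with the harmonic part tied to loops in the domain. The sketch below illustrates that idea on a toy grid. It is not the paper's method: it uses a plain combinatorial cell complex with unit weights and least-squares projections, not the discrete Hodge Laplacian with boundary conditions that the paper refers to. The helper names `grid_complex` and `hodge_decompose`, the grid size, and the random edge field are all assumptions made for illustration.

```python
import numpy as np


def grid_complex(n, holes=()):
    """Return (d0, d1) for an n x n vertex grid (hypothetical helper).

    d0 maps vertex values to edge values (discrete gradient) and
    d1 maps edge values to face values (discrete curl).
    Faces listed in `holes` are left out, which creates loops.
    """
    vid = lambda i, j: i * n + j
    edges = {}
    for i in range(n):                      # horizontal edges (i,j)->(i,j+1)
        for j in range(n - 1):
            edges[(vid(i, j), vid(i, j + 1))] = len(edges)
    for i in range(n - 1):                  # vertical edges (i,j)->(i+1,j)
        for j in range(n):
            edges[(vid(i, j), vid(i + 1, j))] = len(edges)

    d0 = np.zeros((len(edges), n * n))
    for (a, b), e in edges.items():
        d0[e, a], d0[e, b] = -1.0, 1.0

    faces = [(i, j) for i in range(n - 1) for j in range(n - 1)
             if (i, j) not in holes]
    d1 = np.zeros((len(faces), len(edges)))
    for f, (i, j) in enumerate(faces):      # walk around each square cell
        d1[f, edges[(vid(i, j), vid(i, j + 1))]] = 1.0
        d1[f, edges[(vid(i, j + 1), vid(i + 1, j + 1))]] = 1.0
        d1[f, edges[(vid(i + 1, j), vid(i + 1, j + 1))]] = -1.0
        d1[f, edges[(vid(i, j), vid(i + 1, j))]] = -1.0
    return d0, d1


def hodge_decompose(omega, d0, d1):
    """Split an edge field into curl-free, divergence-free, harmonic parts."""
    # Least squares gives the orthogonal projection onto each image space.
    alpha = np.linalg.lstsq(d0, omega, rcond=None)[0]
    beta = np.linalg.lstsq(d1.T, omega, rcond=None)[0]
    curl_free = d0 @ alpha                  # lies in im(d0)
    div_free = d1.T @ beta                  # lies in im(d1^T)
    harmonic = omega - curl_free - div_free
    return curl_free, div_free, harmonic


# Toy usage: a 5 x 5 grid with one missing cell, i.e. one loop.
d0, d1 = grid_complex(5, holes={(1, 1)})
omega = np.random.default_rng(0).normal(size=d0.shape[0])
curl_free, div_free, harmonic = hodge_decompose(omega, d0, d1)

# Dimension of the harmonic space equals the number of independent loops.
betti_1 = d0.shape[0] - np.linalg.matrix_rank(d0) - np.linalg.matrix_rank(d1)
print("first Betti number:", betti_1)
```

In this sketch the three parts are mutually orthogonal because `d1 @ d0` is the zero matrix, and the harmonic part vanishes when no cell is removed. Stacking the three parts as channels would correspond to the multi-channel input described for Figure 1, although the paper's actual construction of the vector field from an image is not specified in this summary.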