TopoCL: Topological Contrastive Learning for Medical Imaging

Guangyu Meng, Pengfei Gu, Peixian Liang, John P. Lalor, Erin Wolf Chambers, Danny Z. Chen

Abstract

Contrastive learning (CL) has become a powerful approach for learning representations from unlabeled images. However, existing CL methods focus predominantly on visual appearance features while neglecting topological characteristics (e.g., connectivity patterns, boundary configurations, cavity formations) that provide valuable cues for medical image analysis. To address this limitation, we propose a new topological CL framework (TopoCL) that explicitly exploits topological structures during contrastive learning for medical imaging. Specifically, we first introduce topology-aware augmentations that control topological perturbations using a relative bottleneck distance between persistence diagrams, preserving medically relevant topological properties while enabling controlled structural variations. We then design a Hierarchical Topology Encoder that captures topological features through self-attention and cross-attention mechanisms. Finally, we develop an adaptive mixture-of-experts (MoE) module to dynamically integrate visual and topological representations. TopoCL can be seamlessly integrated with existing CL methods. We evaluate TopoCL on five representative CL methods (SimCLR, MoCo-v3, BYOL, DINO, and Barlow Twins) and five diverse medical image classification datasets. The experimental results show that TopoCL achieves consistent improvements: an average gain of +3.26% in linear probe classification accuracy with strong statistical significance, verifying its effectiveness.
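
The relative bottleneck distance mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the exact definition, so the code below assumes one plausible form: the bottleneck distance between the persistence diagrams of the original and augmented images, divided by the largest persistence in the original diagram. The function names, the normalization, and the `low`/`high` acceptance band are all hypothetical, and the brute-force matching is only practical for tiny diagrams. For real diagrams a library routine such as GUDHI's `gudhi.bottleneck_distance` would be the practical choice.

```python
from itertools import permutations
from math import inf


def _cost(p, q):
    """L-infinity cost of matching p to q; None stands for the diagonal."""
    if p is None and q is None:
        return 0.0
    if p is None:
        return (q[1] - q[0]) / 2.0
    if q is None:
        return (p[1] - p[0]) / 2.0
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def bottleneck_distance(pd_a, pd_b):
    """Exact bottleneck distance by brute force (tiny diagrams only)."""
    # Pad each diagram with diagonal slots so any point may go unmatched.
    xs = list(pd_a) + [None] * len(pd_b)
    ys = list(pd_b) + [None] * len(pd_a)
    best = inf
    for perm in permutations(range(len(ys))):
        worst = max((_cost(xs[i], ys[j]) for i, j in enumerate(perm)), default=0.0)
        best = min(best, worst)
    return best


def relative_bottleneck(pd_orig, pd_aug):
    """ASSUMPTION: normalize by the largest persistence of the original diagram."""
    scale = max((death - birth for birth, death in pd_orig), default=0.0)
    dist = bottleneck_distance(pd_orig, pd_aug)
    if scale == 0.0:
        return 0.0 if dist == 0.0 else inf
    return dist / scale


def accept_augmentation(pd_orig, pd_aug, low, high):
    """Hypothetical rule: keep a view whose relative perturbation lies in [low, high]."""
    return low <= relative_bottleneck(pd_orig, pd_aug) <= high
```

Under these assumptions, a topology-weak view would be accepted with a band close to zero and a topology-strong view with a higher band, so both views stay within a controlled topological distance of the input.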

Paper Structure (14 sections, 3 equations, 3 figures, 7 tables)

Figures (3)

  • Figure 1: Visualization of failure cases corrected by TopoCL on ISIC2019 [tschandl2018ham10000]. Two dermatofibroma (DF) cases are misclassified by the baseline MoCo-v3 [chen2021mocov3] as actinic keratosis (AK, top) and melanocytic nevi (NV, bottom). MoCo-v3+TopoCL correctly classifies both as DF by capturing characteristic topological features: circular-to-oval boundary patterns with uniform internal connectivity (top) and radial pigmentation structures with a consistent boundary configuration despite hair overlay (bottom). Grad-CAM [selvaraju2017grad] heatmaps show that TopoCL focuses on lesion boundaries and internal structural patterns, while the baseline exhibits scattered attention on peripheral or irrelevant regions.
  • Figure 2: An overview of our TopoCL framework. (a) Topological Contrastive Learning: Given an input image $x$, we generate two views with topology-weak ($x_{\text{topo}}^w$) and topology-strong ($x_{\text{topo}}^s$) augmentations, and compute their persistence diagrams (PDs) $\text{topo}^w$ and $\text{topo}^s$. The visual encoder processes $x_{\text{topo}}^w$ and $x_{\text{topo}}^s$, while the topology encoder processes $\text{topo}^w$ and $\text{topo}^s$; projection heads then produce visual features $\mathbf{f}^w$, $\mathbf{f}^s$ and topological features $\mathbf{t}^w$, $\mathbf{t}^s$. The TopoCL MoE module fuses these features through five experts (colors distinguish the fusion strategies: Vis-Only, Topo-Only, Concat, Gated, and Cross-Attn) with learned gating weights, producing fused representations $\mathbf{h}^w$, $\mathbf{h}^s$. A projection head maps these to the final representations $\mathbf{z}^w$, $\mathbf{z}^s$, which are optimized via a contrastive loss. (b) The structure of the H-Topo Encoder: it uses hierarchical self- and cross-attention to capture topological features from PDs in both the $H_0$ and $H_1$ homology dimensions.
  • Figure 3: Expert gating analysis for TopoCL integrated with BYOL on five datasets. Error bars show the standard deviation across test samples. The five experts are visual-only (Vis), topology-only (Topo), concatenation (Conc), gated blending (Gate), and cross-attention (Attn).
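
The five-expert fusion described in the captions of Figures 2 and 3 can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the class name `FusionMoE`, every weight matrix, the random initialization, and the router design are hypothetical. The captions do not specify how the cross-attention expert is built, so here it is a simplified stand-in in which the visual feature queries the two tokens $\{\mathbf{f}, \mathbf{t}\}$.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


class FusionMoE:
    """Hypothetical five-expert fusion of visual (f) and topological (t) features."""

    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(dim)
        self.dim = dim
        self.W_cat = rng.normal(0, s, (2 * dim, dim))    # Concat expert projection
        self.W_gate = rng.normal(0, s, (2 * dim, dim))   # Gated expert blend weights
        self.W_q, self.W_k, self.W_v = (rng.normal(0, s, (dim, dim)) for _ in range(3))
        self.W_router = rng.normal(0, s, (2 * dim, 5))   # one logit per expert

    def __call__(self, f, t):
        # f, t: arrays of shape (batch, dim)
        ft = np.concatenate([f, t], axis=-1)
        # Gated expert: elementwise sigmoid gate blends f and t.
        g = 1.0 / (1.0 + np.exp(-(ft @ self.W_gate)))
        # Cross-Attn expert (ASSUMPTION): visual query over the tokens {f, t}.
        tokens = np.stack([f, t], axis=1)                # (batch, 2, dim)
        q = f @ self.W_q
        k = tokens @ self.W_k
        v = tokens @ self.W_v
        attn = softmax(np.einsum("bd,bnd->bn", q, k) / np.sqrt(self.dim))
        cross = np.einsum("bn,bnd->bd", attn, v)
        # Expert order: Vis-Only, Topo-Only, Concat, Gated, Cross-Attn.
        experts = np.stack([f, t, ft @ self.W_cat, g * f + (1 - g) * t, cross], axis=1)
        w = softmax(ft @ self.W_router)                  # (batch, 5) gating weights
        h = np.einsum("be,bed->bd", w, experts)          # weighted sum of expert outputs
        return h, w
```

Calling `FusionMoE(dim=8)(f, t)` on two `(batch, 8)` arrays returns the fused representation `h` and the per-sample gating weights `w`. Averaging `w` over a test set gives per-expert statistics of the kind Figure 3 reports.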