CoRLD: Contrastive Representation Learning Of Deformable Shapes In Images

Tonmoy Hossain, Miaomiao Zhang

TL;DR

CoRLD tackles the reliance on template images in deformable shape learning and the difficulty of capturing voxel-level differences by learning class-aware contrastive shape features in a latent deformation space, with template guidance only in the loss during training. It decouples template input from inference and fuses learned geometric features with image features in a boosted classifier to improve medical image classification. The authors demonstrate state-of-the-art performance on 2D brain MRIs and 3D adrenal CT shapes across multiple backbones, and show robustness to perturbations and benefits from multi-template training. The work offers a flexible, template-free approach to deformable-shape representation learning with practical implications for medical imaging.

Abstract

Deformable shape representations, parameterized by deformations relative to a given template, have proven effective for improving image analysis tasks. However, their broader applicability is hindered by two major challenges. First, existing methods mainly rely on a known template during testing, which is impractical and limits flexibility. Second, they often struggle to capture fine-grained, voxel-level distinctions between similar shapes (e.g., anatomical variations among healthy individuals, those with mild cognitive impairment, and diseased states). To address these limitations, we propose a novel framework, Contrastive Representation Learning of Deformable shapes (CoRLD), in learned deformation spaces and demonstrate its effectiveness in the context of image classification. CoRLD leverages a class-aware supervised contrastive learning objective in latent deformation spaces, promoting proximity among representations of the same class while ensuring separation between different classes. In contrast to previous deep learning networks that require a reference image as input to predict deformation changes, our approach eliminates this dependency. Instead, template images are utilized solely as ground truth in the loss function during the training process, making our model more flexible and generalizable to a wide range of medical applications. We validate CoRLD on diverse datasets, including real brain magnetic resonance images (MRIs) and adrenal shapes derived from computed tomography (CT) scans. Experimental results show that our model effectively extracts deformable shape features, which can be easily integrated with existing classifiers to substantially boost classification accuracy. Our code is available at GitHub.
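To make the class-aware contrastive objective concrete, the following is a minimal NumPy sketch of a generic supervised contrastive loss applied to a batch of latent vectors. It is an illustration only, not the authors' implementation: the function name `supervised_contrastive_loss`, the flattened `(N, D)` feature layout, and the default `tau=0.1` are assumptions, and the exact CoRLD loss (including the template-guided term) is not specified in this summary.

```python
import numpy as np


def supervised_contrastive_loss(features, labels, tau=0.1):
    """Generic supervised contrastive loss (hypothetical helper, not CoRLD's code).

    features: (N, D) latent vectors, e.g. flattened latent deformation features.
    labels:   (N,) integer class labels.
    tau:      temperature; smaller values sharpen the similarity distribution.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    n = features.shape[0]

    # L2-normalize so that dot products are cosine similarities.
    norms = np.maximum(np.linalg.norm(features, axis=1, keepdims=True), 1e-12)
    z = features / norms
    sim = (z @ z.T) / tau

    # Exclude each anchor's similarity with itself from the softmax.
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable row-wise log-softmax.
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Positives: other samples in the batch that share the anchor's label.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0  # no anchor has a positive pair in this batch

    # Average log-probability over positives, then over valid anchors.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    loss_per_anchor = -pos_log_prob[valid] / pos_count[valid]
    return float(loss_per_anchor.mean())


if __name__ == "__main__":
    # Two tight clusters; labels that match the clusters should give a lower loss.
    feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(feats, [0, 0, 1, 1]))
    print(supervised_contrastive_loss(feats, [0, 1, 0, 1]))
```

In this sketch, features from the same class are pulled together and features from different classes are pushed apart, and `tau` plays the role of the temperature parameter ($\tau$) whose effect the paper examines in Figure 5.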

Paper Structure

This paper contains 11 sections, 7 equations, 5 figures, 3 tables, and 1 algorithm.

Figures (5)

  • Figure 1: An overview of our proposed model CoRLD.
  • Figure 2: Left to right: Examples of brain MRI slices across four diagnostic groups (CN, EMCI, LMCI, AD) and 3D adrenal shapes derived from CTs, visualized in three anatomical planes (axial, coronal, and sagittal).
  • Figure 3: Left to right: Visual comparison of the deformed template, its error map with the target, velocity in colormap, and deformation field for template-guided (w/ Temp) and CoRLD (w/o Temp) models.
  • Figure 4: Classification accuracy comparison across all models, including CoRLD, under ResNet and DenseNet backbones for 2D brain MRI (left panel) and 3D adrenal shape (right panel) datasets at different adversarial noise levels.
  • Figure 5: Effect of the temperature parameter ($\tau$) on network performance for the 2D brain dataset.