CoRLD: Contrastive Representation Learning Of Deformable Shapes In Images
Tonmoy Hossain, Miaomiao Zhang
TL;DR
CoRLD addresses two limitations of deformable shape learning: reliance on a template image at test time and difficulty capturing voxel-level differences between similar shapes. It learns class-aware contrastive shape features in a latent deformation space and uses templates only in the training loss, so no template input is needed at inference. The learned geometric features are fused with image features in a boosted classifier to improve medical image classification. The authors report state-of-the-art performance on 2D brain MRIs and 3D adrenal CT shapes across multiple backbones, along with robustness to perturbations and benefits from multi-template training. The result is a flexible approach to deformable-shape representation learning that is template-free at inference, with practical implications for medical imaging.
Abstract
Deformable shape representations, parameterized by deformations relative to a given template, have proven effective for improving image analysis tasks. However, their broader applicability is hindered by two major challenges. First, existing methods mainly rely on a known template during testing, which is impractical and limits flexibility. Second, they often struggle to capture fine-grained, voxel-level distinctions between similar shapes (e.g., anatomical variations among healthy individuals, those with mild cognitive impairment, and those in diseased states). To address these limitations, we propose a novel framework, Contrastive Representation Learning of Deformable shapes (CoRLD), that operates in learned deformation spaces, and we demonstrate its effectiveness in the context of image classification. CoRLD leverages a class-aware supervised contrastive learning objective in latent deformation spaces, pulling representations of the same class closer together while keeping representations of different classes apart. In contrast to previous deep learning networks that require a reference image as input to predict deformations, our approach eliminates this dependency. Instead, template images are used solely as ground truth in the loss function during training, making our model more flexible and generalizable to a wide range of medical applications. We validate CoRLD on diverse datasets, including real brain magnetic resonance images (MRIs) and adrenal shapes derived from computed tomography (CT) scans. Experimental results show that our model effectively extracts deformable shape features, which can be easily integrated with existing classifiers to substantially boost classification accuracy. Our code is available at GitHub.
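To make the class-aware contrastive objective concrete, the following is a minimal sketch of a generic supervised contrastive loss applied to latent vectors. It is an illustration under stated assumptions, not the CoRLD implementation: the function names, the cosine-similarity choice, and the temperature value are hypothetical, and the abstract does not specify the exact form of the loss or how latent deformation features are encoded.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def supervised_contrastive_loss(latents, labels, temperature=0.1):
    """Generic supervised contrastive loss over a batch of latent vectors.

    Assumption: `latents` stands in for latent deformation features and
    `labels` for class labels; both names are illustrative only.
    """
    z = [l2_normalize(v) for v in latents]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        # Positives: other samples that share the anchor's class label.
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Log-sum-exp over all non-anchor samples, computed stably.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability of the positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


if __name__ == "__main__":
    feats = [[1.0, 0.1], [0.9, 0.2], [-1.0, 0.1], [-0.9, 0.2]]
    print(supervised_contrastive_loss(feats, [0, 0, 1, 1]))  # labels match clusters
    print(supervised_contrastive_loss(feats, [0, 1, 0, 1]))  # labels cut across clusters
```

In this toy batch the loss is small when the labels agree with the geometric clusters and large when they cut across them, which is the behavior the objective rewards: same-class representations close together, different-class representations apart.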
