
Effective Feature Learning for 3D Medical Registration via Domain-Specialized DINO Pretraining

Eytan Kats, Mattias P. Heinrich

Abstract

Medical image registration is a critical component of clinical imaging workflows, enabling accurate longitudinal assessment, multi-modal data fusion, and image-guided interventions. Intensity-based approaches often struggle with interscanner variability and complex anatomical deformations, whereas feature-based methods offer improved robustness by leveraging semantically informed representations. In this work, we investigate DINO-style self-supervised pretraining directly on 3D medical imaging data, aiming to learn dense volumetric features well suited for deformable registration. We assess the resulting representations on a challenging interpatient abdominal registration task across both MRI and CT modalities. Our domain-specialized pretraining outperforms the DINOv2 model trained on a large-scale collection of natural images, while requiring substantially lower computational resources at inference time. Moreover, it surpasses established registration models under out-of-domain evaluation, demonstrating the value of task-agnostic yet medical imaging-focused pretraining for robust and efficient 3D image registration.


This paper contains 11 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Zero-shot registration framework. Features from the moving and fixed images are first extracted using the pretrained transformer and then projected into a low-dimensional space using PCA. ConvexAdam leverages these semantically rich representations to estimate an accurate displacement field.
  • Figure 2: Comparison of registration performance across different feature extractors. The results highlight the advantage of domain-specific DINO-style pretraining over DINOv2 and MIND features.
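The PCA step named in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pca_project`, the `(C, D, H, W)` feature layout, the default of 16 components, and the choice to fit one PCA basis jointly on the fixed and moving features are all assumptions made for illustration. The feature extractor and ConvexAdam are outside the scope of the sketch.

```python
import numpy as np


def pca_project(feat_fixed, feat_moving, n_components=16):
    """Project two (C, D, H, W) feature volumes onto a shared PCA basis.

    Hypothetical helper: the layout and the joint fit are assumptions,
    not details taken from the paper.
    """
    c = feat_fixed.shape[0]
    # Flatten the spatial dimensions so each location is one C-dim sample.
    flat_f = feat_fixed.reshape(c, -1).T
    flat_m = feat_moving.reshape(c, -1).T
    stacked = np.concatenate([flat_f, flat_m], axis=0)

    # Center the data; the principal axes are the right singular vectors.
    mean = stacked.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(stacked - mean, full_matrices=False)
    basis = vt[:n_components].T  # shape (C, n_components)

    # Apply the same basis to both volumes so their features stay comparable.
    proj_f = (flat_f - mean) @ basis
    proj_m = (flat_m - mean) @ basis
    out_f = proj_f.T.reshape(n_components, *feat_fixed.shape[1:])
    out_m = proj_m.T.reshape(n_components, *feat_moving.shape[1:])
    return out_f, out_m
```

Using one shared basis keeps the two low-dimensional volumes in the same coordinate system, which a feature-similarity-based optimizer would need in order to compare them voxel by voxel.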