Metadata-Aligned 3D MRI Representations for Contrast Understanding and Quality Control

Mehmet Yigit Avci, Pedro Borges, Virginia Fernandez, Paul Wright, Mehmet Yigitsoy, Sebastien Ourselin, Jorge Cardoso

TL;DR

MRI datasets suffer from substantial heterogeneity and lack standardized contrast labels across scanners, hindering large-scale automated analysis. The authors propose MR-CLIP, a metadata-guided, contrastive learning framework that aligns 3D MRI volumes with DICOM metadata via a 3D image encoder and a metadata encoder, forming acquisition clusters that reflect true contrast differences. The approach yields 3D contrast representations that enable few-shot sequence classification and unsupervised quality control through image–metadata embedding distances, demonstrating strong clustering and transferability. This work provides a scalable, label-efficient foundation for organizing, classifying, and quality-controlling diverse MRI repositories across clinical sites.

Abstract

Magnetic Resonance Imaging suffers from substantial data heterogeneity and the absence of standardized contrast labels across scanners, protocols, and institutions, which severely limits large-scale automated analysis. A unified representation of MRI contrast would enable a wide range of downstream utilities, from automatic sequence recognition to harmonization and quality control, without relying on manual annotations. To this end, we introduce MR-CLIP, a metadata-guided framework that learns MRI contrast representations by aligning volumetric images with their DICOM acquisition parameters. The resulting embeddings show distinct clusters of MRI sequences and outperform supervised 3D baselines under data scarcity in few-shot sequence classification. Moreover, MR-CLIP enables unsupervised data quality control by identifying corrupted or inconsistent metadata through image-metadata embedding distances. By transforming routinely available acquisition metadata into a supervisory signal, MR-CLIP provides a scalable foundation for label-efficient MRI analysis across diverse clinical datasets.
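To make the idea of aligning images with acquisition metadata concrete, the sketch below computes a standard CLIP-style symmetric contrastive loss on a batch of paired embeddings. It is an illustration only: the abstract does not state the exact objective, so the one-to-one pairing, the temperature value, and the function names (`l2_normalize`, `log_softmax`, `symmetric_contrastive_loss`) are assumptions, and random arrays stand in for the outputs of the 3D image encoder and the metadata encoder.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(image_emb, meta_emb, temperature=0.07):
    """CLIP-style loss; row i of image_emb is assumed to pair with row i of meta_emb.

    The temperature of 0.07 is an assumed default, not a value from the paper.
    """
    img = l2_normalize(image_emb)
    meta = l2_normalize(meta_emb)
    logits = img @ meta.T / temperature  # (batch, batch) scaled cosine similarities
    idx = np.arange(logits.shape[0])
    # Image-to-metadata direction: each row should peak on its diagonal entry.
    loss_i2m = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Metadata-to-image direction: each column should peak on its diagonal entry.
    loss_m2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2m + loss_m2i)


# Stand-in embeddings: perfectly matched pairs versus deliberately mismatched pairs.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
aligned_loss = symmetric_contrastive_loss(emb, emb)
mismatched_loss = symmetric_contrastive_loss(emb, np.roll(emb, 1, axis=0))
```

In this toy setup the matched batch should score a much lower loss than the mismatched one, which is the pressure that pulls an image embedding toward the embedding of its own metadata. A variant that treats all acquisitions with similar metadata as positives would need a different target matrix than the identity pairing assumed here.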

Paper Structure

This paper contains 7 sections, 2 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: MR-CLIP aligns MRI volumes with their corresponding DICOM metadata through contrastive learning. A 3D image encoder and a metadata encoder jointly learn to associate similar acquisitions while distinguishing different contrasts, resulting in contrast-aware representations that are robust to anatomical and subtle acquisition variability.
  • Figure 2: Error rates across DICOM tags based on linear probe classification results.
  • Figure 3: t-SNE visualizations of image and metadata embeddings, color coded by sequence.
  • Figure 4: Few-shot learning performance of a linear classifier trained on MR-CLIP image embeddings, compared to a supervised 3D ResNet baseline.
  • Figure 5: Evaluation of quality control, showing the degradation of average cosine similarity with increasing metadata error rate (A) and the AUC performance (B) across three error types and severity levels with 50% error rate.
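The quality-control evaluation summarized in Figure 5 rests on two quantities: the cosine similarity between each image embedding and its metadata embedding, and an AUC for separating corrupted from clean records. The sketch below shows one plausible way to compute both on synthetic data. It is not the authors' evaluation code: the embeddings are random stand-ins, the corruption model (replacing half of the metadata embeddings with unrelated vectors) is an assumption, and the helper names `paired_cosine_similarity` and `auc_from_scores` are hypothetical.

```python
import numpy as np


def paired_cosine_similarity(image_emb, meta_emb, eps=1e-12):
    """Cosine similarity between row i of image_emb and row i of meta_emb."""
    num = (image_emb * meta_emb).sum(axis=1)
    den = np.linalg.norm(image_emb, axis=1) * np.linalg.norm(meta_emb, axis=1)
    return num / np.maximum(den, eps)


def auc_from_scores(scores, is_corrupted):
    """Rank-based AUC: probability that a corrupted sample scores higher than
    a clean one, counting ties as one half."""
    scores = np.asarray(scores, dtype=float)
    is_corrupted = np.asarray(is_corrupted, dtype=bool)
    pos = scores[is_corrupted]
    neg = scores[~is_corrupted]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


# Synthetic stand-ins for encoder outputs (assumption: clean pairs are nearly aligned).
rng = np.random.default_rng(0)
n, dim = 200, 32
image_emb = rng.normal(size=(n, dim))
meta_emb = image_emb + 0.1 * rng.normal(size=(n, dim))

# Corrupt 50% of the records by swapping in unrelated metadata embeddings.
is_corrupted = np.zeros(n, dtype=bool)
is_corrupted[: n // 2] = True
meta_emb[is_corrupted] = rng.normal(size=(n // 2, dim))

similarity = paired_cosine_similarity(image_emb, meta_emb)
anomaly_score = 1.0 - similarity  # low similarity means a suspicious record
auc = auc_from_scores(anomaly_score, is_corrupted)
mean_clean = similarity[~is_corrupted].mean()
mean_corrupted = similarity[is_corrupted].mean()
```

Because no labels are needed to compute `anomaly_score`, the same score could be thresholded directly on an unlabeled archive; the corruption flags are used here only to measure how well the score separates the two groups.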