Table of Contents
Fetching ...

ICHOR: A Robust Representation Learning Approach for ASL CBF Maps with Self-Supervised Masked Autoencoders

Xavier Beltran-Urbano, Yiran Li, Xinglin Zeng, Katie R. Jobson, Manuel Taso, Christopher A. Brown, David A. Wolk, Corey T. McMillan, Ilya M. Nashrallah, Paul A. Yushkevich, Ze Wang, John A. Detre, Sudipto Dolui

TL;DR

ICHOR is a self supervised pre-training approach for ASL CBF maps that learns transferable representations using 3D masked autoencoders that outperformed existing neuroimaging self-supervised pre-training methods adapted to ASL.

Abstract

Arterial spin labeling (ASL) perfusion MRI allows direct quantification of regional cerebral blood flow (CBF) without exogenous contrast, enabling noninvasive measurements that can be repeated without constraints imposed by contrast injection. ASL is increasingly acquired in research studies and clinical MRI protocols. Building on successes in structural imaging, recent efforts have implemented deep learning based methods to improve image quality, enable automated quality control, and derive robust quantitative and predictive biomarkers with ASL derived CBF. However, progress has been limited by variable image quality, substantial inter-site, vendor and protocol differences, and limited availability of labeled datasets needed to train models that generalize across cohorts. To address these challenges, we introduce ICHOR, a self supervised pre-training approach for ASL CBF maps that learns transferable representations using 3D masked autoencoders. ICHOR is pretrained via masked image modeling using a Vision Transformer backbone and can be used as a general-purpose encoder for downstream ASL tasks. For pre-training, we curated one of the largest ASL datasets to date, comprising 11,405 ASL CBF scans from 14 studies spanning multiple sites and acquisition protocols. We evaluated the pre-trained ICHOR encoder on three downstream diagnostic classification tasks and one ASL CBF map quality prediction regression task. Across all evaluations, ICHOR outperformed existing neuroimaging self-supervised pre-training methods adapted to ASL. Pre-trained weights and code will be made publicly available.

ICHOR: A Robust Representation Learning Approach for ASL CBF Maps with Self-Supervised Masked Autoencoders

TL;DR

ICHOR is a self supervised pre-training approach for ASL CBF maps that learns transferable representations using 3D masked autoencoders that outperformed existing neuroimaging self-supervised pre-training methods adapted to ASL.

Abstract

Arterial spin labeling (ASL) perfusion MRI allows direct quantification of regional cerebral blood flow (CBF) without exogenous contrast, enabling noninvasive measurements that can be repeated without constraints imposed by contrast injection. ASL is increasingly acquired in research studies and clinical MRI protocols. Building on successes in structural imaging, recent efforts have implemented deep learning based methods to improve image quality, enable automated quality control, and derive robust quantitative and predictive biomarkers with ASL derived CBF. However, progress has been limited by variable image quality, substantial inter-site, vendor and protocol differences, and limited availability of labeled datasets needed to train models that generalize across cohorts. To address these challenges, we introduce ICHOR, a self supervised pre-training approach for ASL CBF maps that learns transferable representations using 3D masked autoencoders. ICHOR is pretrained via masked image modeling using a Vision Transformer backbone and can be used as a general-purpose encoder for downstream ASL tasks. For pre-training, we curated one of the largest ASL datasets to date, comprising 11,405 ASL CBF scans from 14 studies spanning multiple sites and acquisition protocols. We evaluated the pre-trained ICHOR encoder on three downstream diagnostic classification tasks and one ASL CBF map quality prediction regression task. Across all evaluations, ICHOR outperformed existing neuroimaging self-supervised pre-training methods adapted to ASL. Pre-trained weights and code will be made publicly available.
Paper Structure (12 sections, 1 equation, 4 figures, 7 tables)

This paper contains 12 sections, 1 equation, 4 figures, 7 tables.

Figures (4)

  • Figure 1: Overview of the presented study.
  • Figure 2: Schematic of the ICHOR framework. Stage 1 (top) represents the proposed self-supervised 3D MAE pre-training pipeline. Stage 2 (bottom) illustrates the downstream task adaptation.
  • Figure 3: Nested 5-fold cross-validation strategy used for evaluation.
  • Figure 4: Reconstruction results for pre-trained models trained with different masking ratios across imaging protocols and scanner vendors.