CoCoLIT: ControlNet-Conditioned Latent Image Translation for MRI to Amyloid PET Synthesis

Alec Sargood, Lemuel Puglisi, James H. Cole, Neil P. Oxtoby, Daniele Ravì, Daniel C. Alexander

TL;DR

CoCoLIT generates amyloid PET scans from structural MRI to enable scalable Alzheimer's Disease (AD) screening. It combines latent diffusion modeling with ControlNet conditioning, introduces a Weighted Image Space Loss (WISL), and provides a theoretical and empirical analysis of Latent Average Stabilization (LAS), enabling efficient, high-fidelity MRI-to-PET translation in 3D. The approach attains state-of-the-art performance on both image-based and amyloid-related metrics, including substantially improved Aβ-positivity classification, and its LAS analysis justifies decoding the mean of the sampled latents using a reduced number of samples. By making training and inference more efficient and robust, CoCoLIT holds promise for clinical translation and may generalize to other cross-modality or disease-trajectory tasks in medical imaging.

Abstract

Synthesizing amyloid PET scans from the more widely available and accessible structural MRI modality offers a promising, cost-effective approach for large-scale Alzheimer's Disease (AD) screening. This is motivated by evidence that, while MRI does not directly detect amyloid pathology, it may nonetheless encode information correlated with amyloid deposition that can be uncovered through advanced modeling. However, the high dimensionality and structural complexity of 3D neuroimaging data pose significant challenges for existing MRI-to-PET translation methods. Modeling the cross-modality relationship in a lower-dimensional latent space can simplify the learning task and enable more effective translation. As such, we present CoCoLIT (ControlNet-Conditioned Latent Image Translation), a diffusion-based latent generative framework that incorporates three main innovations: (1) a novel Weighted Image Space Loss (WISL) that improves latent representation learning and synthesis quality; (2) a theoretical and empirical analysis of Latent Average Stabilization (LAS), an existing technique used in similar generative models to enhance inference consistency; and (3) the introduction of ControlNet-based conditioning for MRI-to-PET translation. We evaluate CoCoLIT's performance on publicly available datasets and find that our model significantly outperforms state-of-the-art methods on both image-based and amyloid-related metrics. Notably, in amyloid-positivity classification, CoCoLIT outperforms the second-best method with improvements of +10.5% on the internal dataset and +23.7% on the external dataset. The code and models of our approach are available at https://github.com/brAIn-science/CoCoLIT.
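The idea behind LAS, as summarized above, is to decode the mean of several sampled latents rather than a single sample. The following is a minimal toy sketch of that idea, not the authors' implementation: `sample_latent` and `decode` are hypothetical stand-ins (Gaussian noise around the conditioning vector, and an identity decoder) for the conditional diffusion sampler and the PET VAE decoder, and the sample count `m` is an illustrative parameter.

```python
import random
from statistics import fmean


def sample_latent(condition, rng):
    """Hypothetical stand-in for one conditional diffusion sampling run.

    A real sampler would denoise from random noise under MRI conditioning;
    here we simply add unit Gaussian noise to the conditioning vector.
    """
    return [c + rng.gauss(0.0, 1.0) for c in condition]


def decode(latent):
    """Hypothetical stand-in for the PET VAE decoder (identity mapping)."""
    return list(latent)


def las_decode(condition, m, rng):
    """Draw m latents, average them element-wise, then decode the mean once."""
    samples = [sample_latent(condition, rng) for _ in range(m)]
    mean_latent = [fmean(values) for values in zip(*samples)]
    return decode(mean_latent)


def mean_squared_error(a, b):
    """Mean squared difference between two equally long vectors."""
    return fmean([(x - y) ** 2 for x, y in zip(a, b)])


if __name__ == "__main__":
    rng = random.Random(0)
    condition = [0.0] * 2000  # toy "MRI latent"
    for m in (1, 4, 16):
        output = las_decode(condition, m, rng)
        print(m, mean_squared_error(output, condition))
```

In this toy setting the samples are independent, so the spread of the averaged latent around the conditioning vector shrinks roughly in proportion to 1/m. How many samples the real model needs is the subject of the paper's LAS analysis and is not reproduced here.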

Paper Structure

This paper contains 34 sections, 26 equations, 2 figures, 3 tables, 1 algorithm.

Figures (2)

  • Figure 1: Overview of the CoCoLIT framework. (A–B) Training of the MRI and PET VAEs. (C) Training of the unconditional LDM on PET latents. (D) Training of the ControlNet and fine-tuning of the PET VAE decoder using standard noise loss and WISL. (E) Inference process in CoCoLIT, including the LAS algorithm.
  • Figure 2: Qualitative comparison of SUVR maps predicted from structural MRIs using CoCoLIT and baseline methods on both internal and external test sets. The color bar on the right indicates SUVR values ranging from 0.0 to 2.5.