CoCoLIT: ControlNet-Conditioned Latent Image Translation for MRI to Amyloid PET Synthesis
Alec Sargood, Lemuel Puglisi, James H. Cole, Neil P. Oxtoby, Daniele Ravì, Daniel C. Alexander
TL;DR
CoCoLIT synthesizes amyloid PET from structural MRI to enable scalable Alzheimer's Disease (AD) screening. It combines latent diffusion modeling with ControlNet conditioning, introduces a Weighted Image Space Loss (WISL), and provides a theoretical and empirical analysis of Latent Average Stabilization (LAS), enabling efficient, high-fidelity 3D MRI-to-PET translation. The approach attains state-of-the-art performance on both image-based and amyloid-related metrics, including substantially improved Aβ-positivity classification, and its LAS analysis justifies decoding the latent mean computed from a reduced number of samples. By making training and inference more efficient and robust, CoCoLIT holds promise for clinical translation and may generalize to other cross-modality or disease-trajectory tasks in medical imaging.
Abstract
Synthesizing amyloid PET scans from the more widely available and accessible structural MRI modality offers a promising, cost-effective approach for large-scale Alzheimer's Disease (AD) screening. This is motivated by evidence that, while MRI does not directly detect amyloid pathology, it may nonetheless encode information correlated with amyloid deposition that can be uncovered through advanced modeling. However, the high dimensionality and structural complexity of 3D neuroimaging data pose significant challenges for existing MRI-to-PET translation methods. Modeling the cross-modality relationship in a lower-dimensional latent space can simplify the learning task and enable more effective translation. As such, we present CoCoLIT (ControlNet-Conditioned Latent Image Translation), a diffusion-based latent generative framework that incorporates three main innovations: (1) a novel Weighted Image Space Loss (WISL) that improves latent representation learning and synthesis quality; (2) a theoretical and empirical analysis of Latent Average Stabilization (LAS), an existing technique used in similar generative models to enhance inference consistency; and (3) the introduction of ControlNet-based conditioning for MRI-to-PET translation. We evaluate CoCoLIT's performance on publicly available datasets and find that our model significantly outperforms state-of-the-art methods on both image-based and amyloid-related metrics. Notably, in amyloid-positivity classification, CoCoLIT outperforms the second-best method with improvements of +10.5% on the internal dataset and +23.7% on the external dataset. The code and models of our approach are available at https://github.com/brAIn-science/CoCoLIT.
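The latent averaging idea behind LAS can be illustrated with a toy sketch. This is not the CoCoLIT implementation: `sample_latent` and `decode` are hypothetical stand-ins for a diffusion sampler and an autoencoder decoder, the latent is a three-element list instead of a 3D volume, and the noise model is an assumption made purely for illustration. The sketch draws m latents, averages them element-wise, decodes the mean once, and measures how the spread of the output shrinks as m grows.

```python
import random
from statistics import mean, pvariance


def sample_latent(rng, center, noise_std=1.0):
    """Hypothetical stand-in for one stochastic run of a latent sampler."""
    return [c + rng.gauss(0.0, noise_std) for c in center]


def latent_average(rng, center, m):
    """Draw m latents and return their element-wise mean."""
    draws = [sample_latent(rng, center) for _ in range(m)]
    return [mean(values) for values in zip(*draws)]


def decode(latent):
    """Hypothetical stand-in for a decoder (a fixed affine map here)."""
    return [2.0 * z + 1.0 for z in latent]


def output_spread(m, trials=2000, seed=0):
    """Variance of the first decoded value across repeated inference runs."""
    rng = random.Random(seed)
    center = [0.5, -1.0, 2.0]  # assumed "true" latent for the toy example
    firsts = [decode(latent_average(rng, center, m))[0] for _ in range(trials)]
    return pvariance(firsts)


if __name__ == "__main__":
    for m in (1, 4, 8):
        print(f"m={m}: output variance = {output_spread(m):.3f}")
```

In this toy setting the output variance falls roughly in proportion to 1/m, which is the kind of trade-off between sample count and inference consistency that an analysis of LAS has to quantify. A real decoder is nonlinear, so the behavior there need not match this simple case.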
