Language-Enhanced Generative Modeling for Amyloid PET Synthesis from MRI and Blood Biomarkers

Zhengjie Zhang, Xiaoxie Mao, Qihao Guo, Shaoting Zhang, Qi Huang, Mu Zhou, Fang Xie, Mianxin Liu

TL;DR

This study tackles the cost and accessibility barriers of Abeta-PET by predicting Abeta-PET spatial patterns from MRI and blood-based biomarkers (BBMs) using a language-enhanced generative framework. A language-enhanced encoder uses a medical LLM to convert non-imaging clinical data into semantic features; these features guide GAN-based PET synthesis from MRI, yielding high-fidelity PET images that preserve diagnostic features. The synthetic PETs achieve close structural and regional fidelity to real scans and support a fully AI-driven AD diagnostic pipeline that outperforms MRI- and BBM-only baselines; combining synthetic PET with BBMs further improves accuracy. The work demonstrates a cost-effective, scalable approach for large-scale AD screening and suggests avenues for extending the framework to other biomarkers and PET modalities.

Abstract

Background: Alzheimer's disease (AD) diagnosis relies heavily on amyloid-beta positron emission tomography (Abeta-PET), which is costly and not widely accessible. This study explores whether Abeta-PET spatial patterns can be predicted from blood-based biomarkers (BBMs) and MRI scans. Methods: We collected Abeta-PET images, T1-weighted MRI scans, and BBMs from 566 participants. A language-enhanced generative model, driven by a large language model (LLM) and multimodal information fusion, was developed to synthesize PET images. Synthesized images were evaluated for image quality, diagnostic consistency, and clinical applicability within a fully automated diagnostic pipeline. Findings: The synthetic PET images closely resemble real PET scans in both structural details (SSIM = 0.920 +/- 0.003) and regional patterns (Pearson's r = 0.955 +/- 0.007). Diagnostic outcomes using synthetic PET show high agreement with real PET-based diagnoses (accuracy = 0.80). Using synthetic PET, we developed a fully automatic AD diagnostic pipeline integrating PET synthesis and classification. The synthetic PET-based model (AUC = 0.78) outperforms T1-based (AUC = 0.68) and BBM-based (AUC = 0.73) models, while combining synthetic PET and BBMs further improves performance (AUC = 0.79). Ablation analysis supports the advantages of LLM integration and prompt engineering. Interpretation: Our language-enhanced generative model synthesizes realistic PET images, enhancing the utility of MRI and BBMs for Abeta spatial pattern assessment and improving the diagnostic workflow for Alzheimer's disease.
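The regional-pattern agreement reported above (Pearson's r) can be illustrated with a minimal sketch. It assumes that each scan is reduced to one mean SUVR per atlas region and that the correlation is then computed across regions; the function names, the flattened-voxel representation, and the toy values are hypothetical and are not taken from the paper.

```python
import math

def regional_means(voxels, labels):
    """Mean value per region label; label 0 is treated as background (assumption)."""
    sums, counts = {}, {}
    for value, label in zip(voxels, labels):
        if label == 0:
            continue
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sorted(sums)}

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Toy data: a flattened "volume" of 7 voxels assigned to 3 regions plus background.
atlas_labels = [1, 1, 2, 2, 3, 3, 0]
real_pet = [1.0, 1.2, 1.6, 1.8, 1.1, 1.3, 0.2]
synthetic_pet = [1.1, 1.1, 1.5, 1.7, 1.2, 1.4, 0.9]

real_regions = regional_means(real_pet, atlas_labels)
synthetic_regions = regional_means(synthetic_pet, atlas_labels)

# Correlate the per-region means, keeping the regions in the same order.
regions = sorted(real_regions)
r = pearson_r([real_regions[k] for k in regions],
              [synthetic_regions[k] for k in regions])
print(f"regional Pearson r = {r:.3f}")
```

Averaging within regions before correlating makes the score insensitive to voxel-level noise, which is one plausible reason to report it alongside a voxel-level metric such as SSIM.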

Paper Structure

This paper contains 25 sections, 4 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Overview of our proposed language-enhanced generative framework for PET synthesis. (A) Language-enhanced encoder: Clinical variables including demographics, BBMs, and NAs are formatted into a global-context prompt, which begins with a global PET characteristics statement (ground-truth diagnosis for training and clinical-variable-based prediction for inference, see Methods), followed by structured clinical details. The prompt is encoded into text features using a medical LLM. (B) Generative adversarial network: The generator uses T1 images along with text features to synthesize Aβ-PET images. The discriminator, used only during the training stage, evaluates the authenticity of PET images by comparing them with paired T1 images and text features. (C) Architecture comparison: This component illustrates the key differences between our method and two representative baselines. Our approach ("T1+BBMs-LLM") synthesizes PET images using T1-weighted images and BBMs encoded as text features via an LLM. In contrast, the "T1-only" baseline uses only T1 images, while the "T1+BBMs-Num" baseline incorporates BBMs as normalized numerical features.
  • Figure 2: Detailed architecture of the proposed framework. (A) Generator: The generator employs a three-layer U-Net architecture. The encoder includes three downsampling blocks with double convolutional layers and max pooling. The encoded T1 features are then fused with text features. The decoder comprises three upsampling blocks with trilinear interpolation and double convolutional layers, incorporating skip connections to retain feature information. Finally, the decoder output is processed through a convolutional layer to synthesize the PET image. (B) Discriminator: The discriminator begins by concatenating the T1 and PET images, followed by feature extraction using four convolutional layers. These image features are integrated with text features. The combined features are subsequently processed through a convolutional layer and a fully connected layer to produce the authenticity judgment for each PET image. (C) Fusion Module: In the generator fusion module, text features generate scale and shift parameters through non-linear multilayer perceptrons (MLPs), which adjust image features on a channel-wise basis. In the discriminator fusion module, text features are first reduced in dimensionality through a linear layer, then expanded by repeating the channels to fill the remaining dimensions, and finally concatenated with the image features.
  • Figure 3: Overview of the diagnostic consistency evaluation framework. The physician-based evaluation uses a representative subset and a multi-reader arbitration workflow, while the model-based evaluation scales the assessment to the full dataset using a diagnostic model trained on real PET images.
  • Figure 4: Evaluation of the image quality of the synthetic PET images. (A) Violin plots comparing the distributions of image quality assessment metrics for individual predictions across methods. **: P < 0.01. ***: P < 0.001. (B) Visual comparison of the methods through case studies. From left to right: T1, synthetic PET images from the three methods, and real PET images. PET images are displayed using range-normalized SUVRs with pseudocolors for clarity. Areas with obvious differences are highlighted with boxes or circles. Below each synthetic PET image, the corresponding error map shows the absolute difference from the real PET image.
  • Figure 5: Region-based evaluation metrics for different methods. (A) Scatter plots of ground-truth versus predicted regional SUVRs for all subjects, with inter-region correlations, and (B) absolute errors in different brain regions. The brain is divided into 116 grey-matter regions based on the AAL-116 atlas and 4 white-matter regions. The correlation and absolute error between the mean SUVR of the synthetic PET and real PET images for each region are shown. In (B), darker colors represent larger errors. Each row corresponds to a method, with the last row representing our method.
  • ...and 3 more figures
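
The generator fusion module described in the Figure 2 caption, where text features produce channel-wise scale and shift parameters through non-linear MLPs, can be sketched as follows. This is an illustrative sketch only: the layer sizes, the weights, the ReLU non-linearity, and the exact form of the modulation (scale * feature + shift) are assumptions, and all function names are hypothetical rather than taken from the paper's implementation.

```python
def linear(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def mlp(x, w1, b1, w2, b2):
    """Two-layer MLP with a ReLU in between (the non-linearity is an assumption)."""
    return linear(relu(linear(x, w1, b1)), w2, b2)

def modulate(image_features, scale, shift):
    """Apply one scale and one shift per channel.

    image_features: list of channels, each a flat list of voxel values.
    """
    return [[s * v + t for v in channel]
            for channel, s, t in zip(image_features, scale, shift)]

# Hypothetical 2-dimensional text feature and 2-channel image feature map.
text_features = [1.0, -1.0]
image_features = [[1.0, 2.0], [3.0, 4.0]]

# Hypothetical weights for the two MLPs (one for scale, one for shift).
identity, zeros = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
scale = mlp(text_features, identity, zeros, [[2.0, 0.0], [0.0, 3.0]], [0.0, 1.0])
shift = mlp(text_features, identity, zeros, [[0.5, 0.0], [0.0, 0.0]], [0.0, -1.0])

fused = modulate(image_features, scale, shift)
print(scale, shift, fused)
```

Because the scale and shift are shared across all voxels of a channel, the text features act as a global conditioning signal on the image features rather than altering individual spatial locations directly.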