DisQ-HNet: A Disentangled Quantized Half-UNet for Interpretable Multimodal Image Synthesis Applications to Tau-PET Synthesis from T1 and FLAIR MRI

Agamdeep S. Chopra, Caitlin Neher, Tianyi Ren, Juampablo E. Heras Rivera, Mehmet Kurt

TL;DR

DisQ-HNet (DQH) is a framework that synthesizes tau-PET from paired T1-weighted and FLAIR MRI while exposing how each modality contributes to the prediction, through modality-specific attribution of synthesized uptake patterns.

Abstract

Tau positron emission tomography (tau-PET) provides an in vivo marker of Alzheimer's disease (AD) pathology, but cost and limited availability motivate MRI-based alternatives. We introduce DisQ-HNet (DQH), a framework that synthesizes tau-PET from paired T1-weighted and FLAIR MRI while exposing how each modality contributes to the prediction. The method combines (i) a Partial Information Decomposition (PID)-guided, vector-quantized encoder that partitions latent information into redundant, unique, and complementary components, and (ii) a Half-UNet decoder that preserves anatomical detail using pseudo-skip connections conditioned on structural edge cues rather than direct encoder feature reuse. Compared with multiple baselines (VAE, VQ-VAE, and UNet), DisQ-HNet maintains reconstruction fidelity and better preserves disease-relevant signal for downstream AD tasks, including Braak staging, tau localization, and classification. PID-based Shapley analysis provides modality-specific attribution of synthesized uptake patterns.
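
As a rough illustration of the vector-quantization step mentioned in the abstract, the sketch below shows the generic nearest-codebook lookup used in VQ-VAE-style encoders. It is a minimal sketch, not the authors' implementation: the function names `squared_distance` and `quantize`, the toy codebook, and the 2-D latents are all hypothetical, and the abstract does not specify the codebook size, latent dimension, or training losses used by DisQ-HNet.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each latent vector to its nearest codebook entry.

    Returns the chosen code indices and the corresponding quantized vectors.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for z in latents:
        # Nearest-neighbour lookup; ties resolve to the lowest index.
        idx = min(
            range(len(codebook)),
            key=lambda k: squared_distance(z, codebook[k]),
        )
        indices.append(idx)
        quantized.append(list(codebook[idx]))
    return indices, quantized


# Hypothetical toy codebook with three 2-D entries (assumed, not from the paper).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
# Hypothetical continuous encoder outputs before quantization.
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]

indices, quantized = quantize(latents, codebook)
print(indices)
print(quantized)
```

In a trained model the codebook would be learned jointly with the encoder, and gradients would typically be passed through the discrete lookup with a straight-through estimator; both are omitted here to keep the lookup itself visible.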

Paper Structure

This paper contains 32 sections, 20 equations, 8 figures, and 8 tables.

Figures (8)

  • Figure 1: Diagram illustrating the decomposition of mutual information between two inputs $(v_1, v_2)$ and an output $u$. $R(u; v_1, v_2)$ is the redundant information about $u$ shared by both inputs, $U_{v_1}(u; v_1 | v_2)$ and $U_{v_2}(u; v_2 | v_1)$ are the unique contributions, and $C(u; v_1, v_2)$ is the complementary (synergistic) contribution. In classical PID these four non-negative terms exactly account for $I(u; v_1, v_2)$.
  • Figure 2: Architecture of the Inception-based multi-kernel convolution layer (MKConv).
  • Figure 3: DisQ-HNet (DQH) with PID-factorized latents and a structural decoder. T1 and FLAIR are encoded by a shared Encoder$_{RU}$ to produce $z^{\mathrm{T1}}_{RU}$ and $z^{\mathrm{FL}}_{RU}$, and by Encoder$_C$ (on concatenated inputs) to produce the complementary latent $z_C$. Vector quantization (VQ) discretizes these latents. A PID module decomposes the shared representations into a redundant component $z_R$ and unique components $z_{U_{T1}}$ and $z_{U_{FL}}$, which are concatenated with $z_C$ to condition the decoder. The decoder synthesizes PET via stacked MKConv blocks, with downsampling ($\downarrow$) in the encoders and upsampling ($\uparrow$) in the decoder. Pseudo-skip connections upsample the structural component of the redundant latent using a lightweight convolution with kernel size 3, stride 1, and padding 1. These feature maps are conditioned on structural/anatomical features of the input using a lightweight train-time $1\times1\times1$ convolution matching the outputs of a pyramid comprising Sobel-based gradient edge extraction and downsampling.
  • Figure 4: Bland–Altman plots of percent regional SUVR error for six models. Each point corresponds to one ROI from one subject. Horizontal lines indicate the bias and 95% limits of agreement.
  • Figure 5: Stage-stratified regional SUVR bias patterns across synthesis models.
  • ...and 3 more figures
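
The bias and 95% limits of agreement drawn in the Figure 4 plots can be computed along the following lines. This is a minimal sketch under stated assumptions: percent error is taken as `100 * (synthetic - reference) / reference`, the limits use the conventional bias ± 1.96 sample standard deviations, and the function names and SUVR values are hypothetical toy inputs rather than data or code from the paper.

```python
import statistics
from typing import List, Sequence, Tuple


def percent_errors(synth: Sequence[float], ref: Sequence[float]) -> List[float]:
    """Percent regional error of synthetic SUVR relative to reference SUVR.

    Assumed definition: 100 * (synthetic - reference) / reference.
    """
    return [100.0 * (s - r) / r for s, r in zip(synth, ref)]


def bland_altman(errors: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for a set of errors.

    Bias is the mean error; the 95% limits of agreement are
    bias -/+ 1.96 times the sample standard deviation.
    """
    bias = statistics.fmean(errors)
    half_width = 1.96 * statistics.stdev(errors)
    return bias, bias - half_width, bias + half_width


# Hypothetical regional SUVR values (one entry per ROI/subject pair).
reference_suvr = [1.0, 1.25, 2.0, 1.6]
synthetic_suvr = [1.1, 1.2, 1.9, 1.6]

errors = percent_errors(synthetic_suvr, reference_suvr)
bias, lower, upper = bland_altman(errors)
print(f"bias={bias:.2f}%  limits of agreement=[{lower:.2f}%, {upper:.2f}%]")
```

A bias near zero with narrow limits would indicate that a model's regional SUVR agrees closely with the measured PET on average; the horizontal axis of the plot (for example, the mean of synthetic and reference SUVR) is not computed here.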