
Align-cDAE: Alzheimer's Disease Progression Modeling with Attention-Aligned Conditional Diffusion Auto-Encoder

Ayantika Das, Keerthi Ram, Mohanasankar Sivaprakasam

TL;DR

This work proposes a diffusion autoencoder-based framework for disease progression modeling that explicitly enforces alignment between different modalities and devises a mechanism to better structure the latent representational space of the diffusion auto-encoding framework.

Abstract

Modeling and predicting longitudinal human brain images with generative AI frameworks offers an efficient mechanism to track neurodegenerative progression, which is essential for assessing diseases like Alzheimer's. Among existing generative approaches, diffusion-based models have recently emerged as an effective way to generate disease progression images. Incorporating multi-modal and non-imaging attributes as conditional information into diffusion frameworks has been shown to improve controllability during generation. However, existing methods do not explicitly ensure that information from non-imaging conditioning modalities is meaningfully aligned with image features so that it introduces desirable changes in the generated images, such as modulation of progression-specific regions. Further, more precise control over the generation process can be achieved by introducing progression-relevant structure into the internal representations of the model, which existing approaches lack. To address these limitations, we propose a diffusion autoencoder-based framework for disease progression modeling that explicitly enforces alignment between different modalities. The alignment is enforced through an explicit objective function that enables the model to focus on the regions exhibiting progression-related changes. We also devise a mechanism to better structure the latent representational space of the diffusion auto-encoding framework. Specifically, we assign separate latent subspaces for integrating progression-related conditions and for retaining subject-specific identity information, allowing better-controlled image generation. Our results demonstrate that enforcing alignment and better structuring the latent representational space of the diffusion auto-encoding framework lead to more anatomically precise modeling of Alzheimer's disease progression.
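
The idea of separate latent subspaces can be illustrated with a minimal sketch. It assumes the latent vector is simply partitioned by index and that the condition enters through a linear map; the function `condition_latent`, the matrix `W`, the split point `id_dim`, and the example condition values are all hypothetical and are not taken from the paper.

```python
import numpy as np

def condition_latent(z, cond, W, id_dim):
    """Shift only the progression part of z; leave the identity part untouched.

    z      : 1-D latent vector of the baseline image (assumed layout: identity first)
    cond   : 1-D condition vector (e.g. age gap, diagnosis; assumed encoding)
    W      : hypothetical linear map from condition space to the progression subspace
    id_dim : number of leading dimensions reserved for subject identity
    """
    z_id, z_prog = z[:id_dim], z[id_dim:]
    z_prog_new = z_prog + W @ cond  # condition is injected only here
    return np.concatenate([z_id, z_prog_new])

rng = np.random.default_rng(0)
z = rng.normal(size=8)            # toy latent: 3 identity dims + 5 progression dims
cond = np.array([0.5, 1.0])       # toy condition vector
W = rng.normal(size=(5, 2))
z_new = condition_latent(z, cond, W, id_dim=3)
```

In this toy setup the first `id_dim` entries of `z_new` equal those of `z`, so any change in a decoded image would be attributable to the progression subspace alone.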

Paper Structure (16 sections, 4 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Flow diagram indicating the desirable information to be modeled in the latent representational space of our approach.
  • Figure 2: Left to right: The condition encoder ($\mathcal{C}$) integrates progression information with the latent representation of the baseline image ($x_b$) produced by the encoding component ($\mathcal{E}$), guiding the denoising decoder ($\mathcal{D}$) to generate the follow-up image ($\hat{x}_f$). Cross-attention ($A_l$) is computed between the conditioning vector and the decoder layers, and the objective functions enforce alignment of $A_l$ with the progression-specific mask ($M$).
  • Figure 3: Left to right: In sub-figures (A) and (B), columns (a)-(f) present a method-wise comparison of predicted follow-up images ($\hat{x}_f$) along with their error maps computed with respect to the ground-truth follow-up ($|\hat{x}_f - x_f|$) and the ground-truth baseline ($|x_b - \hat{x}_f|$), respectively. Column (g) shows the ground-truth follow-up ($x_f$) in sub-figure (A); in sub-figure (B) it also shows the difference between the ground-truth follow-up and baseline ($|x_b - x_f|$). Top to bottom: For both sub-figures, the upper and lower two rows correspond to subjects aged 83.3 and 84.3 years, respectively, from the AD category. In sub-figure (A), the green box highlights that Align-cDAE better predicts progression-related changes.
  • Figure 4: Left to right: (a) Ground truth, (b) cDAE, and (c) Align-cDAE. Top to bottom: (i) Ground-truth and predicted follow-up images, (ii) (a) ground-truth baseline and (b)-(c) model attention maps overlaid on images, and (iii) difference between the ground-truth/predicted follow-up and the ground-truth baseline images.
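
The alignment between a cross-attention map $A_l$ and the progression-specific mask $M$ described in the Figure 2 caption can be sketched as a loss term. The paper's actual objective functions are not reproduced in this listing, so the example below swaps in a soft Dice loss purely as an illustration; `attention_alignment_loss` and the toy arrays are hypothetical.

```python
import numpy as np

def attention_alignment_loss(attn, mask, eps=1e-8):
    """Return 1 - soft Dice overlap between an attention map and a binary mask.

    attn : 2-D attention map from one decoder layer (any real values)
    mask : 2-D binary mask of progression-specific regions, same shape as attn
    Soft Dice is a stand-in here, not the paper's stated objective.
    """
    # Min-max normalize the attention map to [0, 1] so it is comparable to the mask.
    a = (attn - attn.min()) / (attn.max() - attn.min() + eps)
    intersection = (a * mask).sum()
    return 1.0 - 2.0 * intersection / (a.sum() + mask.sum() + eps)

mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                 # toy progression region
aligned = mask.copy()                # attention concentrated on the region
misaligned = 1.0 - mask              # attention concentrated everywhere else

loss_aligned = attention_alignment_loss(aligned, mask)
loss_misaligned = attention_alignment_loss(misaligned, mask)
```

With these toy inputs the loss is close to 0 when attention coincides with the mask and equals 1 when the two do not overlap. In a multi-layer setting, one plausible extension is to resize each layer's map to the mask resolution and average the per-layer losses.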