Explainability in Generative Medical Diffusion Models: A Faithfulness-Based Analysis on MRI Synthesis

Surjo Dey; Pallabi Saikia

TL;DR

This paper addresses the limited explainability of diffusion-based MRI synthesis by introducing a faithfulness-based prototype framework that links diffusion-generated features to training prototypes. It combines Denoising Diffusion Probabilistic Models (DDPMs) with three prototype-based networks, ProtoPNet (PPNet), Enhanced ProtoPNet (EPPNet), and ProtoPool, and defines a Faithfulness Score to quantify the alignment between generated outputs and training data. On the DUKE Breast MRI dataset, the synthesized images reach PSNR $=19.37\pm1.67$, SSIM $=0.6530\pm0.1052$, and LPIPS $=0.2893\pm0.1050$, which the authors report as high image fidelity, and EPPNet yields the strongest faithfulness ($F=0.1534$) of the three methods. The findings indicate that diffusion models can be both accurate and transparent: exposing the denoising trajectory and the prototype associations that drive MRI synthesis supports safer and more trustworthy AI applications in medical imaging.
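
To give the PSNR figure quoted above some context, the following is a minimal sketch of the textbook PSNR definition, $10\log_{10}(\mathrm{MAX}^2/\mathrm{MSE})$, in plain Python. It is not the authors' evaluation code: the function name `psnr`, the nested-list image representation, and the assumption that intensities are scaled to $[0, 1]$ are all illustrative choices.

```python
import math


def psnr(reference, generated, max_value=1.0):
    """Peak signal-to-noise ratio (in dB) between two equally sized images.

    Images are nested lists (rows of pixel intensities) in [0, max_value].
    Textbook definition only; the paper's preprocessing is not known here.
    """
    # Flatten both images so they can be compared pixel by pixel.
    ref = [p for row in reference for p in row]
    gen = [p for row in generated for p in row]
    if len(ref) != len(gen):
        raise ValueError("images must have the same number of pixels")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, gen)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 example: a uniform error of 0.1 on a [0, 1] scale.
real = [[0.0, 0.0], [0.0, 0.0]]
synthetic = [[0.1, 0.1], [0.1, 0.1]]
score = psnr(real, synthetic)
```

Under this definition, a uniform per-pixel error of 0.1 on a unit intensity scale corresponds to roughly 20 dB, which is about the level of the reported mean PSNR if the images were normalized in the same way (an assumption, since the normalization is not stated here).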

Abstract

This study investigates the explainability of generative diffusion models in medical imaging, focusing on magnetic resonance imaging (MRI) synthesis. Although diffusion models have shown strong performance in generating realistic medical images, their internal decision-making process remains largely opaque. We present a faithfulness-based explainability framework that analyzes how prototype-based explainability methods, namely ProtoPNet (PPNet), Enhanced ProtoPNet (EPPNet), and ProtoPool, can relate generated features to training features. Our study seeks to understand the reasoning behind image formation, first through the denoising trajectory of the diffusion model and then through prototype-based explanations assessed with a faithfulness analysis. Experimental analysis shows that EPPNet achieves the highest faithfulness (score 0.1534), offering more reliable insight into the generative process. The results highlight that diffusion models can be made more transparent and trustworthy through faithfulness-based explanations, contributing to safer and more interpretable applications of generative AI in healthcare.
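
The denoising trajectory mentioned in the abstract is the reverse of a forward noising process. As a rough illustration, the sketch below implements the standard DDPM closed-form forward step, $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$, in plain Python. The linear schedule, its endpoints (`1e-4` to `0.02`), the 1000 steps, and all function names are assumptions borrowed from common DDPM practice, not settings confirmed by this paper.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, not the paper's)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_noise(x0, t, alpha_bars, rng=random):
    """Sample x_t from x_0 in one shot using the closed-form forward process."""
    a = alpha_bars[t]
    signal, noise = math.sqrt(a), math.sqrt(1.0 - a)
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in x0]


alpha_bars = cumulative_alphas(linear_beta_schedule())
pixels = [0.2, 0.5, 0.8]  # hypothetical flattened image patch
rng = random.Random(0)    # seeded for reproducibility
slightly_noisy = forward_noise(pixels, 10, alpha_bars, rng)
almost_pure_noise = forward_noise(pixels, 999, alpha_bars, rng)
```

Because $\bar{\alpha}_t$ shrinks monotonically toward zero, early timesteps keep most of the image while late timesteps are almost pure Gaussian noise. A trained model walks this path backwards, which is the progression a denoising-trajectory visualization displays.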

Paper Structure (19 sections, 14 equations, 5 figures, 1 table)

Introduction
Methodology
Diffusion Model Architecture for MRI Synthesis
Prototype-Based Explainability Framework
ProtoPNet (PPNet)
Enhanced ProtoPNet (EPPNet)
ProtoPool
Faithfulness Evaluation Metrics
Peak Signal to Noise Ratio (PSNR)
Structural Similarity Index (SSIM)
Learned Perceptual Image Patch Similarity (LPIPS)
Faithfulness Score
Results and Discussion
Dataset
Experimental Setup
...and 4 more sections

Figures (5)

Figure 1: Overall architecture of the proposed diffusion-based explainability framework. The process begins with prototype-based interpretation of extracted anatomical features. The learned prototypes identify feature-level correspondences between synthetic and real MRI regions, providing an interpretable view of the generative process.
Figure 2: Comparison of real breast MRI images with synthetic images generated by the diffusion model, which maintain high fidelity and anatomical consistency.
Figure 3: Visualization of the denoising trajectory across diffusion timesteps. Each frame represents the predicted noise magnitude at a specific stage of the generative process. The progression demonstrates how the trained diffusion model gradually reconstructs the anatomical features of breast MRI images through iterative denoising.
Figure 4: Comparison of faithfulness scores among PPNet, EPPNet, and ProtoPool on synthetic breast MRI images. The bar chart highlights that EPPNet achieved the highest faithfulness score, indicating stronger alignment between prototype activations and the underlying diffusion generation process.
Figure 5: Distribution of Normalized Influence Scores (NIS) across generated samples. The plot illustrates how prototype activations vary among PPNet, EPPNet, and ProtoPool, showing that EPPNet maintains the most balanced and distinct influence distribution across prototypes.
