Semantically Conditioned Diffusion Models for Cerebral DSA Synthesis

Qiwen Xu; David Rügamer; Holger Wenz; Johann Fontana; Nora Meggyeshazi; Andreas Bender; Máté E. Maros

TL;DR

This work addresses the scarcity of publicly shareable cerebral DSA data by introducing a semantically conditioned latent diffusion model that generates arterial-phase DSA frames from explicit anatomical and acquisition-geometry prompts. Trained on a large single-centre dataset, the model combines a latent VAE with cross-attention to BERT-based text conditioning, enabling controlled synthesis across circulation (anterior vs. posterior) and C-arm planes and angles. In a clinical reader study, the synthetic frames were rated as plausibly realistic (image-wise Likert scores of 3.1 to 3.3 on a 5-point scale), and their distribution aligned well with real DSAs (median FID = 15.27), although distal vasculature and underrepresented viewpoints remain challenging. The approach offers a privacy-preserving data source for algorithm development, training, and simulation in neurointerventional research; future work is needed to generalize across centres, extend to sequences, and assess impact on downstream tasks.
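
For reference, the Fréchet inception distance is conventionally defined as shown below. This is the standard formulation, given as a sketch: the symbols $\mu_r, \Sigma_r$ (mean and covariance of Inception features of real frames) and $\mu_g, \Sigma_g$ (the same for generated frames) are notation assumed here for illustration and are not taken from the paper.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate closer feature distributions, and two identical distributions give a value of 0.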

Abstract

Digital subtraction angiography (DSA) plays a central role in the diagnosis and treatment of cerebrovascular disease, yet its invasive nature and high acquisition cost severely limit large-scale data collection and public data sharing. To address this, we developed a semantically conditioned latent diffusion model (LDM) that synthesizes arterial-phase cerebral DSA frames under explicit control of anatomical circulation (anterior vs. posterior) and canonical C-arm positions. We curated a large single-centre DSA dataset of 99,349 frames and trained a conditional LDM using text embeddings that encode anatomy and acquisition geometry. To assess clinical realism, four medical experts (two neuroradiologists, one neurosurgeon, and one internal medicine expert) systematically rated 400 synthetic DSA images on a 5-point Likert scale, separately for large proximal, medium, and small peripheral vessels. The generated images achieved image-wise overall Likert scores ranging from 3.1 to 3.3, with high inter-rater reliability (ICC(2,k) = 0.80–0.87). Distributional similarity to real DSA frames was supported by a low median Fréchet inception distance (FID) of 15.27. Our results indicate that semantically controlled LDMs can produce realistic synthetic DSAs suitable for downstream algorithm development, research, and training.
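
The inter-rater statistic quoted above, ICC(2,k), can be illustrated with a minimal Python sketch. It uses the standard two-way random-effects, absolute-agreement, average-measures form and assumes a complete ratings table (every rater scored every image). The function name `icc2k` and the toy ratings are hypothetical, and this is not the authors' analysis code.

```python
def icc2k(ratings):
    """ICC(2,k) for a complete table of ratings.

    ratings: list of rows, one row per image, one column per rater.
    Assumes no missing values (an assumption of this sketch).
    """
    n = len(ratings)        # number of images (targets)
    k = len(ratings[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 images and >= 2 raters")

    grand = sum(x for row in ratings for x in row) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: images, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Average-measures, absolute-agreement form.
    return (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)


# Hypothetical Likert ratings: 5 images (rows) scored by 4 raters (columns).
toy = [
    [3, 3, 4, 3],
    [2, 2, 3, 2],
    [4, 5, 4, 4],
    [3, 4, 3, 3],
    [1, 2, 2, 1],
]
print(round(icc2k(toy), 3))
```

Because this form penalises systematic offsets between raters as well as random disagreement, a rater who scores consistently higher than the others lowers the value. The sketch raises `ZeroDivisionError` if all ratings are identical, and a real analysis with missing ratings would need a different estimator.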

Paper Structure (24 sections, 8 equations, 5 figures, 5 tables)

Introduction
Results
Study cohort
Cohort selection and Data preprocessing
Synthetic DSA Data Generation
Clinical Assessment
Distributional Analysis of Likert Ratings
Quantitative Evaluation
Discussion
Methods
Study approval
DSA Arterial-Phase Classification
Unconditional Generation
Latent Autoencoder
Latent Diffusion Process
...and 9 more sections

Figures (5)

Figure 1: Data filtering and preprocessing workflow for constructing the arterial-phase DSA dataset
Figure 2: Representative arterial-phase DSA frames from the clinical dataset (left column) and from the conditional diffusion model (right column)
Figure 3: Data flow from generation conditions to evaluation outcomes
Figure 4: Reader study Likert ratings by conditioning setting and vessel segment. (a) Stacked proportional bar plots showing the distribution of raw 1–5 Likert ratings for each arterial segment (prox., med., peri.) across the four conditioning settings (AC/PC × Plane A/B). (b) Boxplots of image-wise segment scores, obtained by averaging the available Likert ratings across raters for each image. Boxes indicate the interquartile range with median lines, whiskers extend to 1.5 × IQR, and points indicate outlying images beyond the whiskers. (c) Heatmap of the mean image-wise Likert scores across conditions and segments. Each cell shows the average rating (1–5). (d) Rater-wise mean Likert scores with 95% confidence intervals. Raters include NR (neuroradiologist), NS (neurosurgeon), and IM (internal medicine expert). (e) Relationship between dataset richness and perceived image quality across conditioning settings
Figure 5: Overview of the semantically conditioned latent diffusion framework for DSA synthesis
