Anatomy-Preserving Latent Diffusion for Generation of Brain Segmentation Masks with Ischemic Infarct

Lucia Borrego, Vajira Thambawita, Marco Ciuffreda, Ines del Val, Alejandro Dominguez, Josep Munuera

TL;DR

The paper tackles the scarcity of brain segmentation masks for NCCT with an anatomy-preserving latent diffusion framework that decouples anatomical structure learning from stochastic generation. A MaskVAE trained exclusively on segmentation masks serves as an explicit anatomical prior, and a diffusion model operating in its latent space synthesizes multi-class brain masks, including ischemic infarcts, from pure noise; generation is unconditional apart from an optional binary prompt that controls lesion presence. At inference, denoised latent codes are decoded through the frozen VAE decoder to produce anatomically coherent masks. Qualitative results show preserved global anatomy and tissue semantics, and distributional analyses indicate realistic class proportions. The authors release a synthetic dataset of 605 masks together with pre-trained models to support data augmentation in annotation-scarce neuroimaging scenarios, offering a practical, scalable approach to anatomy-aware mask generation.

Abstract

The scarcity of high-quality segmentation masks remains a major bottleneck for medical image analysis, particularly in non-contrast CT (NCCT) neuroimaging, where manual annotation is costly and variable. To address this limitation, we propose an anatomy-preserving generative framework for the unconditional synthesis of multi-class brain segmentation masks, including ischemic infarcts. The proposed approach combines a variational autoencoder trained exclusively on segmentation masks to learn an anatomical latent representation, with a diffusion model operating in this latent space to generate new samples from pure noise. At inference, synthetic masks are obtained by decoding denoised latent vectors through the frozen VAE decoder, with optional coarse control over lesion presence via a binary prompt. Qualitative results show that the generated masks preserve global brain anatomy, discrete tissue semantics, and realistic variability, while avoiding the structural artifacts commonly observed in pixel-space generative models. Overall, the proposed framework offers a simple and scalable solution for anatomy-aware mask generation in data-scarce medical imaging scenarios.
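
The inference procedure described in the abstract (sample pure noise, denoise it in latent space, decode through the frozen VAE decoder, optionally pass a binary lesion prompt) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the DDPM-style ancestral sampler, the linear noise schedule, the latent shape, the number of classes (`NUM_CLASSES = 5`), and the stand-in functions `toy_denoiser` and `toy_decoder`, which are hypothetical placeholders for the trained diffusion model and MaskVAE decoder. The abstract does not specify any of these details.

```python
import numpy as np

NUM_CLASSES = 5  # assumption: the source does not state the number of classes


def toy_denoiser(z, t, has_lesion):
    """Hypothetical stand-in for the trained noise-prediction network."""
    return np.zeros_like(z)


def toy_decoder(z):
    """Hypothetical stand-in for the frozen MaskVAE decoder.

    Maps a latent of shape (C, h, w) to class logits of shape
    (NUM_CLASSES, h, w) with a fixed random projection.
    """
    w = np.random.default_rng(123).standard_normal((NUM_CLASSES, z.shape[0]))
    return np.einsum("kc,chw->khw", w, z)


def sample_mask(denoiser, decoder, latent_shape, has_lesion, num_steps=50, seed=0):
    """Generate one label mask: noise -> latent denoising -> frozen decoder."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)  # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    z = rng.standard_normal(latent_shape)  # start from pure noise
    for t in reversed(range(num_steps)):
        eps_hat = denoiser(z, t, has_lesion)  # binary prompt is passed through
        mean = (z - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(latent_shape) if t > 0 else 0.0
        z = mean + np.sqrt(betas[t]) * noise

    logits = decoder(z)  # decoder weights stay fixed during sampling
    return logits.argmax(axis=0)  # discrete multi-class mask


mask = sample_mask(toy_denoiser, toy_decoder, latent_shape=(4, 8, 8), has_lesion=1)
```

Taking the per-pixel argmax over the decoder logits is one simple way to obtain discrete tissue labels; with real models the two stand-in functions would be replaced by the trained networks.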

Paper Structure

This paper contains 24 sections, 14 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Illustration of the slice selection strategy used in this study.
  • Figure 2: Example of multi-class tissue segmentation on a representative NCCT slice. CTseg was used to automatically segment normal tissue classes, whereas ischemic infarct regions correspond to manual annotations provided by expert annotators.
  • Figure 3: Overview of the proposed anatomy-preserving latent diffusion framework for synthetic brain mask generation using a frozen MaskVAE decoder and latent diffusion.
  • Figure 5: Pixel-wise class distribution comparison between real and synthetic brain segmentation masks evaluated on the test set. Distributions are shown for both lesion-free ($y=0$) and lesion-conditioned ($y=1$) samples.
  • Figure : Top row: Real masks
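
A pixel-wise class distribution comparison of the kind described for Figure 5 could be computed along the following lines. This is only a sketch under stated assumptions: the function name `class_proportions`, the integer label encoding, and the tiny example arrays are hypothetical and are not taken from the paper.

```python
import numpy as np


def class_proportions(masks, num_classes):
    """Fraction of pixels assigned to each class across a set of label masks."""
    masks = np.asarray(masks)
    counts = np.bincount(masks.ravel(), minlength=num_classes)
    return counts / counts.sum()


# Hypothetical toy data: integer labels in [0, num_classes).
real_masks = np.array([[[0, 0], [1, 2]]])
synthetic_masks = np.array([[[0, 1], [1, 2]]])

real_p = class_proportions(real_masks, num_classes=4)
synth_p = class_proportions(synthetic_masks, num_classes=4)
gap = np.abs(real_p - synth_p)  # per-class absolute difference in proportion
```

Computing the proportions separately for lesion-free ($y=0$) and lesion-conditioned ($y=1$) samples would mirror the split mentioned in the caption.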