Anatomically-Controllable Medical Image Generation with Segmentation-Guided Diffusion Models

Nicholas Konz, Yuwen Chen, Haoyu Dong, Maciej A. Mazurowski

TL;DR

This work tackles the challenge of enforcing precise anatomical constraints in medical image generation. It introduces SegGuidedDiff, a segmentation-guided diffusion model that conditions on multi-class anatomical masks at every denoising step and uses a mask-ablated training strategy to handle partial masks. On breast MRI and neck-to-pelvis CT datasets, SegGuidedDiff achieves state-of-the-art faithfulness to input masks and competitive anatomical realism, with the additional capability to tune the anatomical similarity of generated images to chosen real images via latent-space interpolation. Applications include cross-modality translation and the generation of anatomically paired or counterfactual data.
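
To make "conditions on multi-class anatomical masks at every denoising step" concrete, here is a minimal NumPy sketch of a DDPM-style sampling loop. It assumes the mask is concatenated channel-wise to the noisy image before each network call, which this summary does not state explicitly. The noise predictor `toy_eps_model`, the linear beta schedule, and the step count are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np


def toy_eps_model(net_input, t):
    """Hypothetical stand-in for a trained denoising network (e.g. a U-Net).

    net_input has 2 channels: the noisy image and the integer mask.
    Returns a noise estimate with the image's single channel.
    """
    image_channel = net_input[:1]
    mask_channel = net_input[1:2]
    return 0.1 * image_channel + 0.01 * mask_channel


def sample_with_mask(mask, num_steps=50, seed=0):
    """DDPM-style ancestral sampling, with the mask fed to the model at every step."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)  # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal((1,) + mask.shape)  # start from pure noise
    mask_channel = mask[None].astype(np.float64)

    for t in reversed(range(num_steps)):
        # The conditioning step: image and mask are stacked along the channel axis.
        net_input = np.concatenate([x, mask_channel], axis=0)
        eps = toy_eps_model(net_input, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x[0]
```

Because the mask enters the network input at each of the `num_steps` iterations, changing the mask changes every update, not just the initialization.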

Abstract

Diffusion models have enabled remarkably high-quality medical image generation, yet it is challenging to enforce anatomical constraints in generated images. To this end, we propose a diffusion model-based method that supports anatomically-controllable medical image generation, by following a multi-class anatomical segmentation mask at each sampling step. We additionally introduce a random mask ablation training algorithm to enable conditioning on a selected combination of anatomical constraints while allowing flexibility in other anatomical areas. We compare our method ("SegGuidedDiff") to existing methods on breast MRI and abdominal/neck-to-pelvis CT datasets with a wide range of anatomical objects. Results show that our method reaches a new state-of-the-art in the faithfulness of generated images to input anatomical masks on both datasets, and is on par for general anatomical realism. Finally, our model also enjoys the extra benefit of being able to adjust the anatomical similarity of generated images to real images of choice through interpolation in its latent space. SegGuidedDiff has many applications, including cross-modality translation, and the generation of paired or counterfactual data. Our code is available at https://github.com/mazurowski-lab/segmentation-guided-diffusion.
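
The random mask ablation idea in the abstract can be sketched as a small data-augmentation step applied to each training mask. This is an illustrative sketch only: the function name `ablate_mask_classes`, the convention that label 0 is background, and the per-class drop probability of 0.5 are assumptions, not details given in this summary.

```python
import numpy as np


def ablate_mask_classes(mask, num_classes, drop_prob=0.5, rng=None):
    """Randomly remove whole classes from an integer segmentation mask.

    Assumes label 0 is background and labels 1..num_classes are anatomical
    classes. Each class is independently dropped with probability drop_prob,
    and its pixels are reassigned to background.
    """
    rng = np.random.default_rng() if rng is None else rng
    class_ids = np.arange(1, num_classes + 1)
    drop = rng.random(num_classes) < drop_prob
    dropped_ids = class_ids[drop]
    ablated = np.where(np.isin(mask, dropped_ids), 0, mask)
    return ablated, dropped_ids


# Usage: a toy 2x3 mask with three classes.
mask = np.array([[0, 1, 2],
                 [3, 1, 2]])
ablated, dropped = ablate_mask_classes(mask, num_classes=3,
                                       rng=np.random.default_rng(0))
```

Training on such partially ablated masks is what would let the model accept a mask with only some classes specified at sampling time, leaving the remaining anatomy unconstrained.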

Paper Structure (22 sections, 1 equation, 6 figures, 2 tables)

This paper contains 22 sections, 1 equation, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Standard diffusion models (right) can fail to create realistic tissue even if the overall image appears high-quality, motivating our segmentation-guided model (center).
  • Figure 2: Visual comparison of our model (SegGuidedDiff, or "Seg-Diff" for short) to existing segmentation-conditional image generation models. For breast MRI, the breast, blood vessel (BV), and fibroglandular tissue (FGT) segmentations are shown as white, red, and blue, respectively, while for CT, the liver, bladder, lungs, kidneys, and bone are in maroon, orange, pink, red, and white, respectively. "MAT" = mask-ablated training, "STD" = our standard method.
  • Figure 3: Generating images (even rows) from masks with classes removed (odd rows), shown for breast MRI.
  • Figure 4: Using our model to generate images that are anatomically similar to real images.
  • Figure 5: Additional samples from all segmentation-conditional models; breast MRI on the left, CT organ on the right. Please see the Fig. 2 caption for more details.
  • ...and 1 more figure