XReal: Realistic Anatomy and Pathology-Aware X-ray Generation via Controllable Diffusion Model

Anees Ur Rehman Hashmi, Ibrahim Almakky, Mohammad Areeb Qazi, Santosh Sanjeev, Vijay Ram Papineni, Jagalpathy Jagdish, Mohammad Yaqub

TL;DR

XReal is a novel controllable diffusion model that generates realistic chest X-ray images through precise control over anatomy and pathology location. It outperforms state-of-the-art X-ray diffusion models in both quantitative metrics and radiologists' ratings.

Abstract

Large-scale generative models have demonstrated impressive capabilities in producing visually compelling images, with increasing applications in medical imaging. However, they continue to grapple with hallucination challenges and the generation of anatomically inaccurate outputs. These limitations are mainly due to the reliance on textual inputs and lack of spatial control over the generated images, hindering the potential usefulness of such models in real-life settings. In this work, we present XReal, a novel controllable diffusion model for generating realistic chest X-ray images through precise anatomy and pathology location control. Our lightweight method comprises an Anatomy Controller and a Pathology Controller to introduce spatial control over anatomy and pathology in a pre-trained Text-to-Image Diffusion Model, respectively, without fine-tuning the model. XReal outperforms state-of-the-art X-ray diffusion models in quantitative metrics and radiologists' ratings, showing significant gains in anatomy and pathology realism. Our model holds promise for advancing generative models in medical imaging, offering greater precision and adaptability while inviting further exploration in this evolving field. The code and pre-trained model weights are publicly available at https://github.com/BioMedIA-MBZUAI/XReal.

Paper Structure (22 sections, 2 equations, 9 figures, 1 table, 1 algorithm)

Figures (9)

  • Figure 1: X-ray generation using different diffusion models. As text-to-image models, RoentGen [chambon2022roentgen] and Cheff [weber2023cascaded] struggle to follow the pathology location information specified in the prompts and do not offer any anatomy control. Our proposed XReal model provides precise control over both anatomical and pathology manifestations through the use of input segmentation masks, significantly enhancing the clinical realism of generated X-ray images.
  • Figure 2: XReal has three components: 1) Anatomy Controller, 2) Latent Diffusion Model, and 3) Pathology Controller. It uses a two-stage process to generate the final image $\hat{x}_p$. The Anatomy Controller guides the LDM to generate image $\hat{x}_a$ based on the anatomy mask $m_a$ without using any textual input (text = "" or None). The Pathology Controller infuses the pathology $p$ (text = $p$) into $\hat{x}_a$ at $m_p$ to obtain the final image $\hat{x}_p$.
  • Figure 3: The top row shows the latent space of the VAE in the LDM. The VAE encoder, $E_G$, preserves the anatomy of the input X-ray image in the latent space, which can be manipulated to provide spatial control. The bottom row shows a sample pathology mask $m_p$ for pneumonia. The Pathology Controller combines this $m_p$ with the latents of the X-ray image to add a specific pathology $p$.
  • Figure 4: Images with unrealistic anatomical structures (i.e., heart at the wrong location) generated during one of the experiments can achieve a low FID score of $\sim$30. This supports our claim that the FID score does not provide any information about image realism.
  • Figure 5: In each row, we show a sample X-ray image with an existing pathology (Left), where we use XReal to remove the pathology (Center) and then add a different pathology (Right).
  • ...and 4 more figures
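
The captions for Figures 2 and 3 describe the Pathology Controller as combining a pathology mask $m_p$ with the latents of the anatomy-controlled image. The sketch below illustrates one common way such a combination can be done: mask-based blending in latent space. It is an assumption for illustration only, not the authors' implementation. The function names `downsample_mask` and `blend_latents`, the 4x32x32 latent shape, and the downsampling factor of 8 are all hypothetical, and random arrays stand in for the VAE encoder $E_G$ and the text-conditioned denoiser.

```python
import numpy as np


def downsample_mask(mask, factor):
    """Reduce a binary pixel-space mask to latent resolution by block max-pooling.

    A latent cell is marked 1 if any pixel in its factor x factor block is 1.
    """
    h, w = mask.shape
    assert h % factor == 0 and w % factor == 0, "mask size must be divisible by factor"
    blocks = mask.reshape(h // factor, factor, w // factor, factor)
    return blocks.max(axis=(1, 3)).astype(np.float32)


def blend_latents(z_pathology, z_anatomy, latent_mask):
    """Keep pathology latents inside the mask and anatomy latents elsewhere."""
    m = latent_mask[None, :, :]  # broadcast the 2D mask over the channel axis
    return m * z_pathology + (1.0 - m) * z_anatomy


rng = np.random.default_rng(0)

# Stand-in for the latents of the anatomy-controlled image (hypothetical shape).
z_a = rng.normal(size=(4, 32, 32)).astype(np.float32)
# Stand-in for latents produced while conditioning on the pathology text p.
z_p = rng.normal(size=(4, 32, 32)).astype(np.float32)

# Hypothetical pathology mask m_p in pixel space: a rectangular region.
m_p = np.zeros((256, 256), dtype=np.uint8)
m_p[64:128, 96:160] = 1

m_latent = downsample_mask(m_p, factor=8)   # shape (32, 32)
z_out = blend_latents(z_p, z_a, m_latent)   # shape (4, 32, 32)
```

In a real diffusion sampler, a blend of this kind would typically be applied at each denoising step, so that the region outside $m_p$ stays anchored to the anatomy-controlled image while the region inside it is free to follow the pathology prompt. Whether XReal applies it per step or only once is not stated in the captions above.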