
LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models

Hantao Zhang, Yuhe Liu, Jiancheng Yang, Shouhong Wan, Xinyuan Wang, Wei Peng, Pascal Fua

TL;DR

LeFusion tackles data scarcity and long-tail pathology biases in medical imaging by generating lesion-containing image–segmentation pairs from lesion-free scans. It reframes diffusion learning to focus on the lesion region by combining forward-diffused background contexts with reverse-diffused foregrounds and applying a lesion-focused loss, thereby preserving backgrounds while synthesizing realistic lesions. To handle complex pathology, it introduces histogram-based texture control for multi-peak lesions, multi-channel decomposition for multi-class lesions, and DiffMask to diversify lesion masks with controllable size and location. Validated on 3D lung nodule CT and cardiac lesion MRI, LeFusion and its variants improve downstream segmentation performance for state-of-the-art models such as nnUNet and SwinUNETR, demonstrating practical utility for data augmentation in medical imaging. The work provides code and models to enable broader adoption and further research in lesion-focused diffusion synthesis.

Abstract

Patient data from real-world clinical practice often suffers from data scarcity and long-tail imbalances, leading to biased outcomes or algorithmic unfairness. This study addresses these challenges by generating lesion-containing image-segmentation pairs from lesion-free images. Previous efforts in medical imaging synthesis have struggled with separating lesion information from background, resulting in low-quality backgrounds and limited control over the synthetic output. Inspired by diffusion-based image inpainting, we propose LeFusion, a lesion-focused diffusion model. By redesigning the diffusion learning objectives to focus on lesion areas, we simplify the learning process and improve control over the output while preserving high-fidelity backgrounds by integrating forward-diffused background contexts into the reverse diffusion process. Additionally, we tackle two major challenges in lesion texture synthesis: 1) multi-peak and 2) multi-class lesions. We introduce two effective strategies: histogram-based texture control and multi-channel decomposition, enabling the controlled generation of high-quality lesions in difficult scenarios. Furthermore, we incorporate lesion mask diffusion, allowing control over lesion size, location, and boundary, thus increasing lesion diversity. Validated on 3D cardiac lesion MRI and lung nodule CT datasets, LeFusion-generated data significantly improves the performance of state-of-the-art segmentation models, including nnUNet and SwinUNETR. Code and model are available at https://github.com/M3DV/LeFusion.
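The two ideas in the abstract, a loss restricted to lesion voxels and a reverse process that reuses forward-diffused real background, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the DDPM-style `alpha_bar_t` schedule value, and the exact masking and normalization are all hypothetical choices made for this example.

```python
import numpy as np

def forward_diffuse(x0, alpha_bar_t, rng):
    """Sample x_t from q(x_t | x_0), assuming a DDPM-style schedule value alpha_bar_t."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise
    return x_t, noise

def lesion_focused_loss(pred_noise, true_noise, mask):
    """Squared error averaged over lesion voxels only (mask == 1).

    Background voxels contribute nothing, so the model is not
    trained to generate background (assumed form of the loss).
    """
    mask = mask.astype(pred_noise.dtype)
    sq_err = (pred_noise - true_noise) ** 2
    return float((sq_err * mask).sum() / max(mask.sum(), 1.0))

def blend_step(x_fg_reverse, x_bg_real, mask, alpha_bar_t, rng):
    """One inference-time blend: generated lesion inside the mask,
    forward-diffused real background outside it."""
    x_bg_t, _ = forward_diffuse(x_bg_real, alpha_bar_t, rng)
    return mask * x_fg_reverse + (1.0 - mask) * x_bg_t

# Toy usage on an 8x8x8 volume with a cubic lesion mask.
rng = np.random.default_rng(0)
healthy = rng.standard_normal((8, 8, 8))
mask = np.zeros((8, 8, 8))
mask[2:5, 2:5, 2:5] = 1.0
generated = rng.standard_normal((8, 8, 8))  # stand-in for a model's reverse step
blended = blend_step(generated, healthy, mask, alpha_bar_t=0.9, rng=rng)
```

Because the background at every step comes from the real scan rather than from the model, the final image keeps the original background outside the mask, which is the property the abstract attributes to LeFusion.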

Paper Structure (30 sections, 6 equations, 14 figures, 4 tables)


Figures (14)

  • Figure 1: Standard Conditional Diffusion vs. Lesion-Focused Diffusion (LeFusion). (a) Standard Conditional Diffusion concatenates background, lesion mask, and noise. The model generates both lesion and background, risking background integrity and wasting capacity on difficult but unnecessary background generation, especially in data-limited settings. (b) Lesion-Focused Diffusion (LeFusion) uses forward-diffused backgrounds and reverse-diffused foregrounds as input. The model reconstructs only the lesion, ensuring realistic background preservation and simplifying the task. (c) LeFusion with Fine Control of Lesion Textures and Masks introduces histogram-based texture control for multi-peak lesions, multi-channel decomposition for multi-class lesions, and lesion mask diffusion for control over size, location and boundary, enhancing lesion quality and diversity.
  • Figure 2: LeFusion: Lesion-Focused Diffusion Model. The top illustrates the training process of LeFusion, while the bottom shows the inference. During training, LeFusion avoids learning unnecessary background generation using a lesion-focused loss. In inference, by combining forward-diffused real backgrounds with reverse-diffused generated foregrounds, LeFusion ensures high-quality background generation. Additionally, we introduce histogram-based texture control to handle multi-peak lesions and multi-channel decomposition for multi-class lesions.
  • Figure 3: Illustration of Lung Nodule Texture Histogram Distribution. Samples are clustered into three groups based on the grayscale image histogram of lesions. The visualized differences between groups are significant, indicating a typical multi-peak distribution. These clusters roughly correspond to ground-glass, part-solid, and solid nodules.
  • Figure 4: DiffMask: Lesion Mask Diffusion. To achieve fine control over lesion size, location, and boundary, we propose two key designs: the boundary mask and the control sphere. The boundary mask removes areas outside the boundary at each diffusion step. The control sphere, trained using the bounding spheres of real masks, enables control over size and location during inference.
  • Figure 5: Visualization of Synthetic Image on Emidec (Lalande et al., 2022) and LIDC (Armato et al., 2011). We compare the differences in image similarity between synthetic pathological cases generated by different methods, using real pathological cases and using normal regions. More visualizations can be found in the Appendix.
  • ...and 9 more figures
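
The control signals described in the captions for Figures 3 and 4 can be illustrated with simple array operations. The sketch below is an assumption-laden example, not the paper's code: `lesion_histogram`, `bounding_sphere`, `sphere_volume`, and `apply_boundary_mask` are hypothetical helper names, the bin count and intensity range are arbitrary, and the sphere here is a simple enclosing sphere centered on the mask's bounding box, which may differ from how the authors construct theirs.

```python
import numpy as np

def lesion_histogram(image, mask, bins=16, value_range=(0.0, 1.0)):
    """Normalized intensity histogram of the voxels inside the lesion mask.

    Such vectors could be clustered (e.g. into three groups) to obtain
    texture-control conditions; the clustering step is not shown here.
    """
    values = image[mask > 0]
    counts, _ = np.histogram(values, bins=bins, range=value_range)
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)

def bounding_sphere(mask):
    """Center and radius of a sphere enclosing all lesion voxels.

    Assumes a non-empty mask. The center is the midpoint of the
    axis-aligned bounding box, so the sphere is not necessarily minimal.
    """
    coords = np.argwhere(mask > 0).astype(float)
    center = (coords.min(axis=0) + coords.max(axis=0)) / 2.0
    radius = float(np.linalg.norm(coords - center, axis=1).max())
    return center, radius

def sphere_volume(shape, center, radius):
    """Binary volume that is 1 inside the sphere, usable as a conditioning channel."""
    grid = np.indices(shape).astype(float)
    center = np.asarray(center, dtype=float).reshape((-1,) + (1,) * len(shape))
    dist = np.sqrt(((grid - center) ** 2).sum(axis=0))
    return (dist <= radius + 1e-6).astype(np.uint8)

def apply_boundary_mask(mask_t, boundary):
    """Remove everything outside the allowed region at one diffusion step."""
    return mask_t * boundary

# Toy usage: derive a control sphere from a real mask, then constrain a noisy mask.
real_mask = np.zeros((8, 8, 8), dtype=np.uint8)
real_mask[2:5, 2:5, 2:5] = 1
center, radius = bounding_sphere(real_mask)
control = sphere_volume(real_mask.shape, center, radius)
noisy_mask = np.ones((8, 8, 8))
constrained = apply_boundary_mask(noisy_mask, control)
```

At inference, moving `center` or scaling `radius` is one way to express the size and location control the Figure 4 caption describes, under the assumptions above.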