CT Synthesis with Conditional Diffusion Models for Abdominal Lymph Node Segmentation

Yongrui Yu, Hanyu Chen, Zitian Zhang, Qiong Xiao, Wenhui Lei, Linrui Dai, Yu Fu, Hui Tan, Guan Wang, Peng Gao, Xiaofan Zhang

TL;DR

This work tackles abdominal lymph node segmentation under limited annotated data by introducing LN-DDPM, a conditional diffusion model that generates paired lymph node images and masks conditioned on lymph node and anatomical structure masks. It uses a dual conditioning scheme: global structure conditioning via channel-wise concatenation and local detail conditioning via cross-attention, so that generated images faithfully reproduce lymph node (LN) characteristics and their abdominal surroundings. Training nnU-Net on the generated data together with real data improves segmentation performance, and nnU-Net remains competitive even when trained exclusively on synthetic data. Ablation studies confirm that the conditioning signals, sampling variations, and attention modes each improve generation fidelity and downstream segmentation metrics.

Abstract

Despite the significant success of deep learning methods in medical image segmentation, computer-aided diagnosis of abdominal lymph nodes remains difficult because of the complex abdominal environment, small and indistinguishable lesions, and limited annotated data. To address these problems, we present a pipeline that integrates a conditional diffusion model for lymph node generation with the nnU-Net model for lymph node segmentation, improving abdominal lymph node segmentation by synthesizing diverse, realistic abdominal lymph node data. We propose LN-DDPM, a conditional denoising diffusion probabilistic model (DDPM) for lymph node (LN) generation. LN-DDPM uses lymph node masks and anatomical structure masks as model conditions. These conditions act through two conditioning mechanisms, global structure conditioning and local detail conditioning, to distinguish lymph nodes from their surroundings and better capture lymph node characteristics. The resulting paired abdominal lymph node images and masks are used for the downstream segmentation task. Experimental results on abdominal lymph node datasets demonstrate that LN-DDPM outperforms other generative methods in abdominal lymph node image synthesis and better assists the downstream abdominal lymph node segmentation task.
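The two conditioning mechanisms named above can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the tensor shapes, the choice to concatenate both masks with the noisy image, and the single-head attention layout are all assumptions made for illustration. The summary only states that global structure conditioning uses channel-wise concatenation and that local detail conditioning uses cross-attention.

```python
import numpy as np

def global_structure_conditioning(x_t, c_a, c_m):
    """Channel-wise concatenation of the noisy image with the condition masks.

    Assumed shapes: x_t is (C, H, W); c_a and c_m are (1, H, W).
    Which masks are concatenated is an assumption, not taken from the paper.
    """
    return np.concatenate([x_t, c_a, c_m], axis=0)

def softmax(z, axis=-1):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def local_detail_conditioning(features, mask_tokens, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical layout).

    features:    (N, d_feat) flattened image features, used as queries
    mask_tokens: (M, d_tok)  mask-encoder outputs, used as keys and values
    w_q: (d_feat, d_k), w_k: (d_tok, d_k), w_v: (d_tok, d_feat)
    """
    q = features @ w_q
    k = mask_tokens @ w_k
    v = mask_tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (N, M)
    # Residual connection: condition information is added to the features.
    return features + attn @ v
```

In this sketch, concatenation gives every spatial position direct access to the coarse layout, while cross-attention lets each feature location weight the encoded mask tokens individually.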

Paper Structure (37 sections, 8 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: Overview of our proposed pipeline, which combines LN-DDPM for abdominal lymph node generation with nnU-Net for abdominal lymph node segmentation. LN-DDPM is conditioned on $\mathbf{c}$, comprising the anatomical structure mask $\mathbf{c}_a$ and the lymph node mask $\mathbf{c}_m$, at each denoising step $t$. During training, we compute the loss between the actual added noise $\boldsymbol{\epsilon}$ and the noise $\boldsymbol{\hat{\epsilon}}$ predicted by the denoising network. During sampling, we augment the original condition $\mathbf{c}$ into the transformed condition $\mathbf{c}^{\prime}$, then generate paired abdominal lymph node images and masks for training the downstream segmentation model. The nnU-Net model is trained on both real and synthetic data and tested on real abdominal lymph node images.
  • Figure 2: Preparation of conditioning signals. We focus on the lymph node region instead of the entire abdominal cavity. The anatomical structure mask is obtained from TotalSegmentator. The original condition $\mathbf{c}$ is used for diffusion model training, and the transformed condition $\mathbf{c}^{\prime}$ is used for diffusion model sampling to increase the diversity of lymph node mask conditions while preserving anatomical structures.
  • Figure 3: The architecture of the denoising network, which follows the U-Net design. GSCond and LDCond denote the global structure conditioning mechanism via channel-wise concatenation and the local detail conditioning mechanism via cross-attention, respectively. The mask encoder encodes the lymph node mask.
  • Figure 4: Visualization of generation results. The first and second columns show the sampling conditions used by LN-DDPM to synthesize abdominal lymph node images. The third column shows real abdominal lymph node images. Methods conditioned on anatomical structures generate more realistic abdominal lymph node images. In addition, LN-DDPM generates lymph node images with clearer boundaries.
  • Figure 5: Visualization of segmentation results when training exclusively on synthetic data. Green and red denote the ground truth and the predictions of abdominal lymph nodes, respectively. LN-DDPM yields the best segmentation performance among the compared methods.
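
The training objective described in the Figure 1 caption, a loss between the added noise $\boldsymbol{\epsilon}$ and the predicted noise $\boldsymbol{\hat{\epsilon}}$, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the linear noise schedule and its default values follow the common DDPM setup, the mean squared error is assumed as the loss, and `denoiser` is a hypothetical callable standing in for the conditional U-Net.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, eps, alpha_bar):
    """Forward diffusion: noise a clean image x0 to step t in closed form."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def noise_prediction_loss(denoiser, x0, cond, t, alpha_bar, rng):
    """MSE between the actual added noise and the denoiser's prediction.

    denoiser(x_t, t, cond) is a hypothetical stand-in for the conditional
    denoising network; cond plays the role of the condition c.
    """
    eps = rng.standard_normal(x0.shape)   # actual added noise
    x_t = q_sample(x0, t, eps, alpha_bar)
    eps_hat = denoiser(x_t, t, cond)      # predicted noise
    return float(np.mean((eps - eps_hat) ** 2))
```

A denoiser that always returns zeros gives a loss close to the noise variance, while a denoiser that recovers the noise exactly drives the loss to zero, which is the behaviour training aims for.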