DISC: Latent Diffusion Models with Self-Distillation from Separated Conditions for Prostate Cancer Grading

Man M. Ho, Elham Ghelichkhan, Yosep Chong, Yufei Zhou, Beatrice Knudsen, Tolga Tasdizen

TL;DR

This work tackles data scarcity in prostate cancer grading by using Latent Diffusion Models (LDMs) conditioned on pixel-level Gleason Grade (GG) masks to synthesize histopathology tiles containing multiple GGs. The authors introduce Self-Distillation from Separated Conditions (DISC), which splits complex GG-guided masks into separate latent features and distills them back into a single conditioned denoising process, enabling accurate generation of GG admixtures at tile granularity. They implement four models (SD, SD-SC, SD-DISC, SD-DISC-CoTrain) and a mask-sampling strategy, then demonstrate that training pixel-level (CarcinoNet) and slide-level (TransMIL) graders with synthetic tiles improves performance on SICAPv2 and generalizes to PANDA, with notable gains for rare GG5. The approach shows that generative augmentation with DISC can enhance grading accuracy when data are limited, potentially impacting clinical decision support for prostate cancer evaluation. Key components include the LDM loss $L_{LDM}$ and the DISC loss $L_{DISC}$, guiding high-fidelity, label-consistent tile synthesis.
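
For orientation, the conditional LDM objective of Rombach et al. (2022) is commonly written as below. This is a sketch in generic notation, not the paper's own equation: the symbols $z_t$ (noised latent at step $t$), $c$ (the GG mask condition), and $\epsilon_\theta$ (the denoising network) are assumptions here, and the exact form of $L_{DISC}$ is not given in this summary, so it is not reproduced.

$$
L_{LDM} = \mathbb{E}_{z,\, c,\, \epsilon \sim \mathcal{N}(0, I),\, t}\left[ \left\lVert \epsilon - \epsilon_\theta(z_t, t, c) \right\rVert_2^2 \right]
$$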

Abstract

Latent Diffusion Models (LDMs) can generate high-fidelity images from noise, offering a promising approach for augmenting histopathology images for training cancer grading models. While previous works successfully generated high-fidelity histopathology images using LDMs, the generation of image tiles to improve prostate cancer grading has not yet been explored. Additionally, LDMs face challenges in accurately generating admixtures of multiple cancer grades in a tile when conditioned by a tile mask. In this study, we train specific LDMs to generate synthetic tiles that contain multiple Gleason Grades (GGs) by leveraging pixel-wise annotations in input tiles. We introduce a novel framework named Self-Distillation from Separated Conditions (DISC) that generates GG patterns guided by GG masks. Finally, we deploy a training framework for pixel-level and slide-level prostate cancer grading, where synthetic tiles are effectively utilized to improve the cancer grading performance of existing models. As a result, this work surpasses previous works in two domains: 1) our LDMs enhanced with DISC produce more accurate tiles in terms of GG patterns, and 2) our training scheme, incorporating synthetic data, significantly improves the generalization of the baseline model for prostate cancer grading, particularly in challenging cases of rare GG5, demonstrating the potential of generative models to enhance cancer grading when data is limited.

Paper Structure (10 sections, 4 equations, 8 figures)

Figures (8)

  • Figure 1: Stable Diffusion (Rombach et al., 2022) produces a sheet of cells resembling GG5 in GG3-indicated regions (top) and fused glands resembling GG4 in Non-Cancer-indicated regions (bottom).
  • Figure 2: In addition to the real patches (top-left) used to train pixel-level and slide-level Gleason grading models (right), we introduce Latent Diffusion Models (LDMs) (Rombach et al., 2022) with Self-Distillation from Separated Conditions (DISC) to accurately generate admixtures of multiple Gleason Grades in a tile when conditioned by a tile mask (bottom-left).
  • Figure 3: Latent Diffusion Models (Rombach et al., 2022) conditioned by guided masks with multiple Gleason Grades (GGs).
  • Figure 4: We introduce Self-Distillation from Separated Conditions (DISC) to improve image synthesis accuracy. Instead of using the initial complex guided mask with multiple Gleason Grades (GGs) (top), we generate separate latent features with distinct labels, which are fused with the mask in the final step for robust patterns. However, this approach multiplies the computational cost by $K$, the number of labels. To address this, we train the main process to distill information from the fused latent features obtained from the Condition-Separated Denoising Process (bottom).
  • Figure 5: A qualitative comparison between Stable Diffusion (SD) (Rombach et al., 2022) and our proposed technique, SD with Self-Distillation from Separated Conditions (DISC) (discussed in the DISC section), for histopathology image synthesis. This work yields higher-confidence label patterns than SD. Notably, SD tends to generate fused glands representing GG4 in Non-Cancer regions (highlighted rectangles) and sheets of cells representing GG5 in GG3-indicated regions (indicated by yellow arrows). Labels: Non-Cancer, GG3, GG4, GG5.
  • ...and 3 more figures
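
The mask-guided fusion described in the Figure 4 caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `fuse_separated_latents` and `distillation_loss` are hypothetical, the mask is assumed to be already resized to the latent resolution, and the mean-squared-error distillation target is an assumption, since the summary does not state the form of the DISC loss.

```python
import numpy as np


def fuse_separated_latents(latents: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Fuse K per-label latents into one latent using a label mask.

    latents: array of shape (K, C, H, W), one denoised latent per label
             (hypothetical output of a condition-separated process).
    mask:    integer array of shape (H, W) with values in [0, K),
             assumed to be at the latent resolution.
    Returns an array of shape (C, H, W) in which every spatial location
    is copied from the latent that belongs to its mask label.
    """
    num_labels = latents.shape[0]
    # One-hot encode the mask: shape (K, H, W).
    label_ids = np.arange(num_labels)[:, None, None]
    one_hot = (mask[None, :, :] == label_ids).astype(latents.dtype)
    # Broadcast over the channel axis and sum out the label axis.
    return (latents * one_hot[:, None, :, :]).sum(axis=0)


def distillation_loss(student_latent: np.ndarray, fused_latent: np.ndarray) -> float:
    """Mean squared error between the single-pass latent and the fused
    target (an assumed stand-in for the DISC loss)."""
    return float(np.mean((student_latent - fused_latent) ** 2))


# Toy usage: 4 labels (Non-Cancer, GG3, GG4, GG5), 2 channels, 4x4 latent.
toy_latents = np.stack(
    [np.full((2, 4, 4), k, dtype=np.float32) for k in range(4)]
)
toy_mask = np.array(
    [[0, 0, 1, 1],
     [0, 0, 1, 1],
     [2, 2, 3, 3],
     [2, 2, 3, 3]]
)
toy_fused = fuse_separated_latents(toy_latents, toy_mask)
toy_loss = distillation_loss(np.zeros_like(toy_fused), toy_fused)
```

In this toy setup each per-label latent is filled with its own label index, so the fused latent reproduces the mask in every channel. Producing the K separate latents is what incurs the K-fold cost noted in the caption; the distillation step is meant to let a single conditioned pass approximate the fused result.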