F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation

Man M. Ho, Shikha Dubey, Yosep Chong, Beatrice Knudsen, Tolga Tasdizen

TL;DR

This work tackles the challenge of translating rapid, artifact-prone Frozen Section (FS) histology images into high-quality FFPE representations by introducing a Latent Diffusion Model (LDM) framework conditioned on text and histopathology pre-trained embeddings. It integrates a GAN-based embedding translator to handle unpaired FFPE guidance and leverages Stable Diffusion XL with LoRA for efficient fine-tuning, using DDIM inversion and a denoising U-Net to preserve diagnostic features. The approach yields substantial gains in downstream classification, with AUC rising from $81.99\%$ to $94.64\%$ and favorable Case-wise Fréchet Distances (CaseFD) compared to AIFFPE, UVCGAN2, and CycleDiffusion, particularly on kidney subtype classification. Altogether, the method sets a new benchmark for FS→FFPE translation, improving reliability and accuracy in histopathology analysis during surgery and guiding future diffusion-based domain translations in pathology.

Abstract

The Frozen Section (FS) technique is a rapid and efficient method, taking only 15-30 minutes to prepare slides for pathologists' evaluation during surgery, enabling immediate decisions on further surgical interventions. However, the FS process often introduces artifacts and distortions such as folds and ice-crystal effects. These artifacts and distortions are absent in the higher-quality formalin-fixed paraffin-embedded (FFPE) slides, which require 2-3 days to prepare. While Generative Adversarial Network (GAN)-based methods have been used to translate FS to FFPE images (F2F), they may leave morphological inaccuracies, retaining FS artifacts or introducing new ones, which reduces the quality of these translations for clinical assessment. In this study, we benchmark recent generative models, focusing on GANs and Latent Diffusion Models (LDMs), to overcome these limitations. We introduce a novel approach that combines LDMs with Histopathology Pre-Trained Embeddings to enhance the restoration of FS images. Our framework leverages LDMs conditioned on both text and pre-trained embeddings to learn meaningful features of FS and FFPE histopathology images. Through diffusion and denoising techniques, our approach preserves essential diagnostic attributes such as color staining and tissue morphology, and it adds an embedding translation mechanism to better predict the target FFPE representation of input FS images. As a result, this work achieves a significant improvement in classification performance, with the Area Under the Curve rising from 81.99% to 94.64%, accompanied by an advantageous CaseFD. This work establishes a new benchmark for FS to FFPE image translation quality, promising enhanced reliability and accuracy in histopathology FS image analysis. Our work is available at https://minhmanho.github.io/f2f_ldm/.
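
The abstract reports CaseFD without spelling out how it is computed. As a point of reference only, the Fréchet distance between two feature sets, each summarized by a Gaussian, is commonly written as below. The symbols are illustrative notation, not taken from the paper: $\mu_r, \Sigma_r$ denote the mean and covariance of features from real FFPE images, and $\mu_g, \Sigma_g$ those of translated images. That the distance is evaluated per case on features from a pre-trained histopathology encoder is an assumption here.

```latex
d_F^2 = \lVert \mu_r - \mu_g \rVert_2^2
      + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Under this definition a lower value means the translated features lie closer to the real FFPE feature distribution.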

Paper Structure

This paper contains 6 sections, 1 equation, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: Overview of FS and FFPE processes and our motivation.
  • Figure 2: Performance of CycleDiffusion-restored slides on kidney subtype classification, including Area Under the Curve (AUC) and accuracy, alongside the case-wise Fréchet Distance in the HIPT-256 feature space (FD-HIPT256). The higher the strength, the more noise is added and the more denoising timesteps are run, bringing the output closer to the FFPE domain.
  • Figure 3: Overview of our FS to FFPE image translation framework.
  • Figure 4: Ablation studies on classifier-free Guidance Scale (GS), Strength, FS-to-FFPE Embedding Translation, LoRA rank, and L0 Regularization for restoring artifacts in FS images, evaluated on downstream kidney subtype classification using macro-averaged Area Under the Curve (AUC) and sample-wise Accuracy (Acc).
  • Figure 5: A qualitative comparison between AIFFPE, UVCGAN2, CycleDiffusion, and ours. Unpaired FFPE images are from a different tissue.
  • ...and 1 more figure
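
The Figure 2 caption notes that a higher strength means more added noise and more denoising timesteps. A minimal sketch of that relationship follows. It assumes the convention commonly used in image-to-image diffusion pipelines, where strength selects how far into the noise schedule the input is pushed before denoising begins. The function name `img2img_schedule` and its parameters are hypothetical, and the paper's exact mapping may differ.

```python
def img2img_schedule(strength, num_inference_steps):
    """Map a strength in [0, 1] to (start_index, steps_to_run).

    Assumed convention (not confirmed by the paper): the sampler has
    `num_inference_steps` timesteps ordered from most to least noisy.
    The input is noised up to the level at `start_index`, and only the
    remaining `steps_to_run` denoising steps are executed.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    # Higher strength -> more denoising steps are executed.
    steps_to_run = min(int(num_inference_steps * strength), num_inference_steps)
    # Higher strength -> start earlier in the schedule, i.e. from a noisier latent.
    start_index = num_inference_steps - steps_to_run
    return start_index, steps_to_run


if __name__ == "__main__":
    for s in (0.0, 0.5, 1.0):
        print(s, img2img_schedule(s, 50))
```

With strength 0 no denoising steps run and the input is left unchanged; with strength 1 the full schedule runs from the noisiest level. This is consistent with the caption's observation that higher strength moves the output further from the FS input and closer to the FFPE domain.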