CytoSyn: a Foundation Diffusion Model for Histopathology -- Tech Report

Thomas Duboudin, Xavier Fontaine, Etienne Andrier, Lionel Guillou, Alexandre Filiot, Thalyssa Baiocco-Rodrigues, Antoine Olivier, Alberto Romagnoni, John Klein, Jean-Baptiste Schiratti

Abstract

Computational pathology has made significant progress in recent years, fueling advances in both fundamental disease understanding and clinically ready tools. This evolution is driven by the availability of large amounts of digitized slides and of specialized deep learning methods and models. Multiple self-supervised foundation feature extractors have been developed, enabling downstream predictive applications from cell segmentation to tumor sub-typing and survival analysis. In contrast, generative foundation models designed specifically for histopathology remain scarce. Such models could address tasks that are beyond the capabilities of feature extractors, such as virtual staining. In this paper, we introduce CytoSyn, a state-of-the-art foundation latent diffusion model that enables the guided generation of highly realistic and diverse histopathology H&E-stained images, as shown in an extensive benchmark. We explored methodological improvements, training set scaling, sampling strategies and slide-level overfitting, culminating in the improved CytoSyn-v2, and compared our work in depth with PixCell, a state-of-the-art model. This comparison highlighted the strong sensitivity of both diffusion models and performance metrics to preprocessing-specific details such as JPEG compression. Our model was trained on a dataset obtained from more than 10,000 TCGA diagnostic whole-slide images of 32 different cancer types. Despite being trained only on oncology slides, it maintains state-of-the-art performance when generating inflammatory bowel disease images. To support the research community, we publicly release CytoSyn's weights, its training and validation datasets, and a sample of synthetic images in this repository: https://huggingface.co/Owkin-Bioptimus/CytoSyn.

Paper Structure

This paper contains 15 sections, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Examples of tiles generated unconditionally with CytoSyn.
  • Figure 2: H0-mini conditioning enables the generation of visually distinct yet biologically highly consistent tiles. Each row shows one reference image (left) and five generated variations.
  • Figure 3: Feature-based conditioning allows fine-grained control over the semantics of the synthesized images, a prerequisite for using synthetic images as data augmentation, while maintaining highly realistic outputs, as illustrated here with a linear interpolation example. Left and right columns: original tiles. Center columns: synthetic tiles obtained by linearly interpolating between the features of the left and right tiles (interpolation factors $0.2, 0.4, 0.6, 0.8$).
  • Figure 4: Unconditional image generation performance of CytoSyn (40M model) across different feature extractors, numbers of sampling steps, sampling methods and validation sets ($y$-axis: Fréchet distance, $x$-axis: number of sampling steps). The inset box in each plot provides a magnified view of the values obtained with 250 sampling steps.
  • Figure 5: Overview of our all-in-one validation pipeline.
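
The feature interpolation described in the caption of Figure 3 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `interpolate_features`, the toy 3-dimensional vectors, and the convention that a factor of 0 corresponds to the left tile are all assumptions. In practice the inputs would be feature-extractor embeddings of the two reference tiles, and each blended vector would be passed to the diffusion model as conditioning.

```python
from typing import List, Sequence


def interpolate_features(
    left: Sequence[float],
    right: Sequence[float],
    factors: Sequence[float] = (0.2, 0.4, 0.6, 0.8),
) -> List[List[float]]:
    """Return one linearly blended feature vector per interpolation factor.

    Assumed convention: a factor of 0 reproduces `left`,
    a factor of 1 reproduces `right`.
    """
    if len(left) != len(right):
        raise ValueError("feature vectors must have the same length")
    return [
        [(1.0 - t) * a + t * b for a, b in zip(left, right)]
        for t in factors
    ]


# Toy vectors standing in for real tile embeddings (hypothetical values).
left_features = [0.0, 1.0, 2.0]
right_features = [1.0, 1.0, 0.0]
blended = interpolate_features(left_features, right_features)
```

Each of the four vectors in `blended` would condition one of the center-column tiles in a row of Figure 3.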