PixCell: A generative foundation model for digital histopathology images
Srikar Yellapragada, Alexandros Graikos, Zilinghan Li, Kostas Triaridis, Varun Belagali, Tarak Nath Nandi, Karen Bai, Beatrice S. Knudsen, Tahsin Kurc, Rajarsi R. Gupta, Prateek Prasanna, Ravi K Madduri, Joel Saltz, Dimitris Samaras
TL;DR
PixCell tackles the core challenges of histopathology data scarcity and privacy by introducing a diffusion-based generative foundation model trained on PanCan-30M, with progressive, SSL-embedding-conditioned training to synthesize high-quality, semantically faithful H&E patches. The model enables data augmentation that boosts downstream classification, supports privacy-preserving data sharing through synthetic data, and extends to zero-shot virtual staining (H&E→IHC) via embedding translation and lightweight adapters. Across extensive evaluations, PixCell delivers superior image realism (low Fréchet distances across pathology encoders), preserves tissue semantics, and achieves diagnostically relevant performance on synthetic images, including BRCA subtyping accuracy. The work also demonstrates practical applications such as synthetic SSL pretraining, synthetic-data pooling for multi-institution learning, and an open release of synthetic data and model weights to accelerate computational pathology research.
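For readers unfamiliar with the realism metric named above: a Fréchet distance compares Gaussian fits of encoder features from real and synthetic images, d^2 = ||mu_r - mu_s||^2 + Tr(S_r + S_s - 2 (S_r S_s)^(1/2)). The sketch below is a simplified illustration, not the evaluation code behind the reported numbers. It assumes diagonal covariances, in which case the trace term reduces to a sum of squared standard-deviation gaps. The function name and the toy feature vectors are hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance


def diagonal_frechet_distance(real_feats, synth_feats):
    """Fréchet distance between diagonal-covariance Gaussian fits.

    Each argument is a list of equal-length feature vectors (lists of floats),
    e.g. embeddings produced by some pathology encoder (assumed, not specified here).
    """
    dims = len(real_feats[0])
    total = 0.0
    for d in range(dims):
        real_col = [vec[d] for vec in real_feats]
        synth_col = [vec[d] for vec in synth_feats]
        # Squared gap between the per-dimension means.
        mean_gap = fmean(real_col) - fmean(synth_col)
        # With diagonal covariances the trace term becomes (sigma_r - sigma_s)^2.
        std_gap = sqrt(pvariance(real_col)) - sqrt(pvariance(synth_col))
        total += mean_gap ** 2 + std_gap ** 2
    return total


# Toy example: the two sets differ only by a shift of 1.0 in the first dimension.
real = [[0.0, 1.0], [2.0, 3.0]]
synth = [[1.0, 1.0], [3.0, 3.0]]
distance = diagonal_frechet_distance(real, synth)
```

A lower value means the synthetic feature distribution sits closer to the real one. The full metric uses complete covariance matrices and a matrix square root, so values from this simplification are not comparable to published scores.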
Abstract
The digitization of histology slides has revolutionized pathology, providing massive datasets for cancer diagnosis and research. Self-supervised and vision-language models have been shown to effectively mine large pathology datasets to learn discriminative representations. On the other hand, there are unique problems in pathology, such as annotated data scarcity, privacy regulations in data sharing, and inherently generative tasks like virtual staining. Generative models, capable of synthesizing realistic and diverse images, present a compelling solution to address these problems through image synthesis. We introduce PixCell, the first generative foundation model for histopathology images. PixCell is a diffusion model trained on PanCan-30M, a large, diverse dataset derived from 69,184 H&E-stained whole slide images of various cancer types. We employ a progressive training strategy and a self-supervision-based conditioning that allows us to scale up training without any human-annotated data. By conditioning on real slides, the synthetic images capture the properties of the real data and can be used as data augmentation for small-scale datasets to boost classification performance. We demonstrate the foundational versatility of PixCell by applying it to two generative downstream tasks: privacy-preserving synthetic data generation and virtual IHC staining. PixCell's high-fidelity conditional generation enables institutions to use their private data to synthesize highly realistic, site-specific surrogate images that can be shared in place of raw patient data. Furthermore, using datasets of roughly paired H&E-IHC tiles, we learn to translate PixCell's conditioning from H&E to multiple IHC stains, allowing the generation of IHC images from H&E inputs. Our trained models are publicly released to accelerate research in computational pathology.
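The augmentation idea in the abstract (condition the generator on a real image, then train on the synthetic variants) can be sketched as a small loop. This is an illustrative outline under stated assumptions, not the released PixCell API: `embed` and `generate` are hypothetical callables standing in for a self-supervised encoder and the conditional diffusion sampler, and the sketch assumes each synthetic patch inherits the label of the real patch that conditioned it.

```python
from typing import Any, Callable, List, Sequence, Tuple

LabeledPatch = Tuple[Any, str]


def augment_with_synthetic(
    labeled_patches: Sequence[LabeledPatch],
    embed: Callable[[Any], Any],          # hypothetical SSL encoder
    generate: Callable[[Any, int], Any],  # hypothetical conditional sampler
    n_per_patch: int = 2,
) -> List[LabeledPatch]:
    """Return the real patches followed by label-inheriting synthetic variants."""
    augmented: List[LabeledPatch] = list(labeled_patches)
    for patch, label in labeled_patches:
        # The conditioning signal is computed once per real patch.
        condition = embed(patch)
        for seed in range(n_per_patch):
            # Assumption: a variant sampled from this condition keeps the label.
            augmented.append((generate(condition, seed), label))
    return augmented


# Toy stand-ins so the sketch runs without any model weights.
toy_data = [("aaa", "tumor"), ("bb", "normal")]
toy_embed = lambda patch: len(patch)
toy_generate = lambda condition, seed: f"synthetic-{condition}-{seed}"
toy_augmented = augment_with_synthetic(toy_data, toy_embed, toy_generate, n_per_patch=2)
```

Passing the encoder and sampler as arguments keeps the loop independent of any particular model, so the same structure would apply to whichever embedding and generation functions are actually used.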
