PathSegDiff: Pathology Segmentation using Diffusion model representations
Sachin Kumar Danisetty, Alexandros Graikos, Srikar Yellapragada, Dimitris Samaras
TL;DR
The paper addresses semantic segmentation in histopathology, where dense annotations are costly and tissue morphology is highly variable. It introduces PathSegDiff, which uses a domain-specific Latent Diffusion Model pre-trained on pathology data and conditioned by a self-supervised encoder (HIPT) to extract rich per-pixel features, followed by a lightweight FCN head for segmentation. PathSegDiff achieves state-of-the-art or competitive results on BCSS and GlaS, outperforming ImageNet-pretrained baselines and demonstrating the value of domain-specific diffusion representations for precise gland- and tissue-level segmentation. Ablation studies reveal optimal diffusion timesteps and feature-layer contributions, and a patch-based fusion strategy enables effective processing of large WSIs, supporting practical deployment in computational pathology.
Abstract
Image segmentation is crucial in many computational pathology pipelines, including those for disease diagnosis, subtyping, and outcome and survival prediction. The common approach for training a segmentation model relies on a pre-trained feature extractor and a dataset of paired images and mask annotations. These are used to train a lightweight prediction model that translates features into per-pixel classes. The choice of feature extractor is central to the performance of the final segmentation model, and recent literature has focused on finding suitable tasks for pre-training it. In this paper, we propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained feature extractors. Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E-stained histopathology images. We employ a simple, fully convolutional network to process the features extracted from the LDM and generate segmentation masks. Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets, highlighting the effectiveness of domain-specific diffusion pre-training in capturing intricate tissue structures and enhancing segmentation accuracy in histopathology images.
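
To make the "frozen feature extractor plus lightweight per-pixel head" recipe concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the extractor yields feature maps at several spatial resolutions; here random arrays stand in for LDM activations. The maps are upsampled to the output size, concatenated along the channel axis, and passed through a single 1x1 convolution, which is a deliberately simplified stand-in for the paper's fully convolutional network. All function names, shapes, and the random weights are hypothetical.

```python
import numpy as np

def upsample_nearest(feat, out_hw):
    """Nearest-neighbour upsampling of a (C, h, w) map to (C, H, W)."""
    _, h, w = feat.shape
    H, W = out_hw
    rows = np.arange(H) * h // H   # source row for every output row
    cols = np.arange(W) * w // W   # source column for every output column
    return feat[:, rows[:, None], cols[None, :]]

def per_pixel_features(feature_maps, out_hw):
    """Upsample every map to the output size and stack along channels."""
    return np.concatenate(
        [upsample_nearest(f, out_hw) for f in feature_maps], axis=0
    )

def segment(feature_maps, weight, bias, out_hw):
    """Apply a 1x1 convolution (per-pixel linear classifier), then argmax."""
    feats = per_pixel_features(feature_maps, out_hw)        # (C_total, H, W)
    logits = np.einsum("kc,chw->khw", weight, feats) + bias[:, None, None]
    return logits.argmax(axis=0)                            # (H, W) class ids

# Random stand-ins for two feature maps taken from a frozen extractor.
rng = np.random.default_rng(0)
maps = [rng.normal(size=(8, 4, 4)), rng.normal(size=(4, 8, 8))]
weight = rng.normal(size=(3, 12))   # 3 classes, 8 + 4 input channels
bias = np.zeros(3)
mask = segment(maps, weight, bias, out_hw=(16, 16))
```

In practice only the head's parameters (`weight` and `bias` here) would be trained on the annotated masks, while the extractor stays fixed; this is what keeps the prediction model lightweight.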
