PathoPainter: Augmenting Histopathology Segmentation via Tumor-aware Inpainting

Hong Liu, Haosen Yang, Evi M. C. Huijben, Mark Schuiveling, Ruisheng Su, Josien P. W. Pluim, Mitko Veta

TL;DR

PathoPainter tackles data scarcity in histopathology tumor segmentation by reframing synthetic data generation as tumor-aware inpainting conditioned on regional embeddings. It uses a latent diffusion model built on VQ-VAE latents and a self-supervised foreground embedding to produce accurate, mask-aligned tumor regions, with embedding sampling from other images to boost diversity. An adaptive uncertain-region filter removes regions likely to mislead segmentation training, improving robustness. Across DCIS, CATCH, and CAMELYON16, PathoPainter consistently improves segmentation IoU when synthetic data are added and outperforms prior methods, demonstrating practical impact for histopathology data augmentation and segmentation learning.

Abstract

Tumor segmentation plays a critical role in histopathology, but it requires costly, fine-grained image-mask pairs annotated by pathologists. Thus, synthesizing histopathology data to expand the dataset is highly desirable. Previous works suffer from inaccuracies and limited diversity in image-mask pairs, both of which hamper segmentation training, particularly given small-scale datasets and the inherent complexity of histopathology images. To address this challenge, we propose PathoPainter, which reformulates image-mask pair generation as a tumor inpainting task. Specifically, our approach preserves the background while inpainting the tumor region, ensuring precise alignment between the generated image and its corresponding mask. To enhance dataset diversity while maintaining biological plausibility, we incorporate a sampling mechanism that conditions tumor inpainting on regional embeddings from a different image. Additionally, we introduce a filtering strategy to exclude uncertain synthetic regions, further improving the quality of the generated data. Our comprehensive evaluation spans multiple datasets featuring diverse tumor types and various training data scales. As a result, segmentation improved significantly with our synthetic data, surpassing existing segmentation data synthesis approaches, e.g., 75.69% -> 77.69% on CAMELYON16. The code is available at https://github.com/HongLiuuuuu/PathoPainter.
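
The TL;DR reports the gains in terms of segmentation IoU (intersection over union). As a reminder of what that metric measures, here is a minimal sketch for a pair of binary masks. The function name binary_iou and the convention of returning 1.0 when both masks are empty are assumptions made for this illustration; this is not the authors' evaluation code.

```python
import numpy as np

def binary_iou(pred, target):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection) / float(union)

# Toy 2x3 masks: 2 pixels overlap, 4 pixels are covered by either mask.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
iou = binary_iou(pred, target)
print(f"IoU = {iou:.2f}")
```

In a tumor segmentation setting, pred would be the thresholded network output and target the pathologist's mask; dataset-level scores are typically aggregated over many such pairs.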

Paper Structure

This paper contains 14 sections, 2 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Illustration of various synthetic methods. The columns represent: (a) the original image $x$, (b) the ground truth mask $m$, (c) DiffTumor, which inpaints the tumor region using the mask as a condition without any content conditioning, (d) STEDM, which generates the entire image with style conditioning, and (e) PathoPainter (ours), which inpaints the tumor region using the mask and tumor embeddings from a different image as conditions.
  • Figure 2: Overview of the PathoPainter framework. During (a) training, the input image $x$ and the masked image $x \odot (1 - m)$ are encoded using a pretrained VQ-VAE to obtain their latent representations $z_0$ and $z_0^{bg}$. These representations, concatenated with the resized mask $z_m$, are fed into the diffusion model, which is conditioned on the embedding $v^{fg}$ extracted by the SSP model; this embedding captures features exclusively from the foreground. To augment the training set for downstream segmentation tasks, we generate reliable image-mask pairs using the method in (b). We randomly select a foreground embedding $v^{fg'}$ from the same tumor type as the target image. The U-Net denoiser then generates a new image-mask pair that contains tumors with characteristics distinct from those in the original image.
  • Figure 3: Visualization of generated images corresponding to the methods in Figure 1. The columns represent: (a) the original image $x$, (b) the ground truth mask $m$, and synthetic images generated by (c) DiffTumor, (d) STEDM, and (e) PathoPainter.
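
The data flow described in the Figure 2 caption can be sketched in a few lines. This is only an illustration under stated assumptions, not the released implementation: toy_encode is a hypothetical stand-in (average pooling) for the pretrained VQ-VAE encoder, the mask is resized by nearest-neighbour subsampling, the latent is perturbed with a fixed amount of Gaussian noise instead of a real diffusion schedule, and the embedding bank with its keys (image_id, tumor_type, embedding) is invented for the example.

```python
import numpy as np

def toy_encode(img, factor=4):
    """Hypothetical stand-in for the pretrained VQ-VAE encoder (average pooling)."""
    c, h, w = img.shape
    return img.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def resize_mask(m, factor=4):
    """Nearest-neighbour resize of a binary (H, W) mask to latent resolution: z_m."""
    return m[::factor, ::factor][None].astype(np.float32)

def build_denoiser_input(z_noisy, x, m, factor=4):
    """Concatenate the noisy latent, the background latent, and the resized mask."""
    x_bg = x * (1.0 - m)[None]           # x ⊙ (1 - m): tumor region zeroed out
    z_bg = toy_encode(x_bg, factor)      # z_0^{bg}
    z_m = resize_mask(m, factor)         # z_m
    return np.concatenate([z_noisy, z_bg, z_m], axis=0)

def sample_foreground_embedding(bank, tumor_type, exclude_id, rng):
    """Pick v^{fg'} from a different image of the same tumor type."""
    candidates = [e for e in bank
                  if e["tumor_type"] == tumor_type and e["image_id"] != exclude_id]
    if not candidates:
        raise ValueError("no embedding from a different image of this tumor type")
    return candidates[int(rng.integers(len(candidates)))]

rng = np.random.default_rng(0)
x = rng.random((3, 64, 64)).astype(np.float32)      # toy RGB patch
m = np.zeros((64, 64), dtype=np.float32)
m[16:32, 16:32] = 1.0                               # toy tumor mask

z0 = toy_encode(x)                                  # z_0
# Assumed fixed noise level; a real model would follow a diffusion schedule.
z_noisy = z0 + 0.1 * rng.standard_normal(z0.shape).astype(np.float32)
denoiser_input = build_denoiser_input(z_noisy, x, m)
print(denoiser_input.shape)                         # channels: 3 + 3 + 1

# Hypothetical embedding bank; labels and 8-dim vectors are placeholders.
bank = [
    {"image_id": "img_a", "tumor_type": "type_a", "embedding": rng.random(8)},
    {"image_id": "img_b", "tumor_type": "type_a", "embedding": rng.random(8)},
    {"image_id": "img_c", "tumor_type": "type_b", "embedding": rng.random(8)},
    {"image_id": "img_d", "tumor_type": "type_a", "embedding": rng.random(8)},
]
v_fg_prime = sample_foreground_embedding(bank, "type_a", exclude_id="img_a", rng=rng)
print(v_fg_prime["image_id"])
```

Because the background latent and the mask are passed alongside the noisy latent, the denoiser only has to synthesize the masked region, which is what keeps the generated image aligned with the mask. Swapping the conditioning embedding for one drawn from another image of the same tumor type is the step that introduces diversity at generation time.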