Polyp-E: Benchmarking the Robustness of Deep Segmentation Models via Polyp Editing

Runpu Wei, Zijin Yin, Kongming Liang, Min Min, Chengwei Pan, Gang Yu, Haonan Huang, Yan Liu, Zhanyu Ma

TL;DR

This work performs attribute editing on real polyps to build a new dataset named Polyp-E, whose samples are realistic enough that clinical experts find it challenging to distinguish them from real data.

Abstract

Automatic polyp segmentation helps assist clinical diagnosis and treatment. In daily clinical practice, clinicians robustly identify polyps despite variations in location and size. It is uncertain whether deep segmentation models can achieve comparable robustness in automated colonoscopic analysis. To benchmark model robustness, we evaluate segmentation models on polyps with varied attributes (e.g., location and size) and on healthy samples. Based on the Latent Diffusion Model, we perform attribute editing on real polyps and build a new dataset named Polyp-E. Our synthetic dataset is highly realistic, to the extent that clinical experts find it challenging to distinguish its samples from real data. We evaluate several existing polyp segmentation models on the proposed benchmark. The results reveal that most of the models are highly sensitive to attribute variations. Used as a novel data augmentation technique, the proposed editing pipeline can also improve both in-distribution and out-of-distribution generalization. The code and datasets will be released.
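
To make "sensitive to attribute variations" concrete, here is a minimal sketch of how such sensitivity could be measured. It is an illustration, not the paper's evaluation code: the choice of the Dice coefficient as the score, and the helper names `dice` and `robustness_gap`, are assumptions. Masks are plain nested lists of 0/1 values.

```python
def dice(pred, target, eps=1e-8):
    """Dice overlap between two binary masks given as lists of lists of 0/1.

    The eps term makes two empty masks score 1.0, which is the desired
    outcome on a healthy sample where the model predicts no polyp.
    """
    intersection = sum(
        p & t
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    total = sum(map(sum, pred)) + sum(map(sum, target))
    return (2 * intersection + eps) / (total + eps)


def robustness_gap(original_scores, edited_scores):
    """Mean score on original images minus mean score on edited images.

    A larger positive gap means the model loses more accuracy when the
    polyp attribute (e.g. size or position) is changed.
    """
    mean_original = sum(original_scores) / len(original_scores)
    mean_edited = sum(edited_scores) / len(edited_scores)
    return mean_original - mean_edited
```

In use, one would score each model on the original images and on each edited variant, then compare the gaps across models.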

Paper Structure

This paper contains 16 sections, 3 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: An overview of our study. We generate healthy, size-varied, and position-varied samples from real polyp images, and then evaluate the robustness of various state-of-the-art segmentation models, including traditional models and foundation models.
  • Figure 2: The illustration of our polyp editing pipeline. (1) We disentangle the background and polyp, then in-paint the missing part of the background. This process yields a healthy tissue sample that corresponds to the original polyp image, effectively separating the pathological area from its surroundings while maintaining the context. (2) We manipulate real images through the precise adjustment of polyp size and position, thereby customizing their appearance and layout. (3) We refine boundaries utilizing in-painting to generate more faithful samples.
  • Figure 3: Visualization of the random mask generation strategy used for background recovery. Each sub-figure shows one training pair for polyp-to-non-polyp image transformation (left: original image; right: the corresponding image with a random mask applied).
  • Figure 4: Examples of negative images in-painted with different masks in the inference phase. Each sub-figure shows, from left to right, the original polyp image, the negative image in-painted with the original polyp mask, and the negative image in-painted with the 20-pixel dilated polyp mask.
  • Figure 5: Qualitative comparison of images generated by ArSDM, ImageNet-E, and the proposed method. From top to bottom, the rows correspond to the healthy image, random position changes within a 20% range, and random size changes within a 20% range.
  • ...and 2 more figures
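
The mask operations mentioned in the captions of Figures 4 and 5 can be sketched in a few lines. This is a rough illustration under stated assumptions, not the authors' pipeline: the square dilation window, the reading of "within a 20% range" as a fraction of image width and height, and the function names `dilate_mask`, `shift_mask`, and `random_shift` are all hypothetical. Masks are nested lists of 0/1 values, and the diffusion-based in-painting step is not shown.

```python
import random


def dilate_mask(mask, radius):
    """Dilate a binary mask with a square window of the given pixel radius.

    Assumption: a square structuring element; the source only states that
    the polyp mask is dilated by 20 pixels.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = 1
    return out


def shift_mask(mask, dx, dy):
    """Translate a mask by (dx, dy) pixels; pixels leaving the frame are dropped."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    out[ny][nx] = 1
    return out


def random_shift(mask, max_fraction=0.2, rng=None):
    """Shift a mask by a random offset of at most max_fraction of each dimension.

    Assumption: the 20% range is measured against image width and height.
    """
    rng = rng or random.Random()
    h, w = len(mask), len(mask[0])
    dx = rng.randint(-int(max_fraction * w), int(max_fraction * w))
    dy = rng.randint(-int(max_fraction * h), int(max_fraction * h))
    return shift_mask(mask, dx, dy), (dx, dy)
```

For example, `dilate_mask(polyp_mask, 20)` would give the enlarged region to in-paint when producing a negative image, and `random_shift(polyp_mask)` would give a relocated target mask for a position-edited sample.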