MARIO: A Mixed Annotation Framework For Polyp Segmentation

Haoyang Li; Yiwen Hu; Jun Wei; Zhen Li

TL;DR

MARIO addresses data scarcity in polyp segmentation by introducing mixed supervision that jointly learns from pixel-, polygon-, box-, scribble-, and point-level annotations within a transformer-based framework. It defines dedicated losses for dense ($L_{BCE}$, $L_{Dice}$), box (mask-to-box, $M2B$), scribble ($L_{Uncertain}$), and point ($L_{points}$) supervision, and combines them into a single total objective, which enables training from heterogeneous data sources. Across eight datasets, MARIO achieves state-of-the-art performance, with a weighted-average Dice of $85.8$ and IoU of $78.3$, outperforming fully supervised baselines. This mixed-supervision approach expands the pool of usable data, reduces the labeling burden, and supports practical clinical deployment for colorectal polyp screening.
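As a rough illustration of the dense terms named above, the following sketch computes a binary cross-entropy term and a Dice term on flattened probability maps and adds them. It is not the authors' implementation: the function names, the smoothing constant, and the unweighted sum are assumptions made for this example, and the summary does not state how the paper weights the terms.

```python
import math

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)

def dice_loss(pred, target, smooth=1.0):
    """One minus the (smoothed) Dice overlap; `smooth` is an assumed constant."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)

def dense_loss(pred, target):
    """Assumed combination for pixel-level labels: plain sum of both terms."""
    return bce_loss(pred, target) + dice_loss(pred, target)

# Flattened 2x2 example: a confident, correct prediction versus a poor one.
target = [1.0, 0.0, 1.0, 0.0]
good = [0.9, 0.1, 0.8, 0.2]
bad = [0.2, 0.8, 0.3, 0.7]
print(dense_loss(good, target), dense_loss(bad, target))
```

The cross-entropy term penalizes each pixel independently, while the Dice term measures region overlap, which is why the two are often paired for small foreground objects such as polyps.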

Abstract

Existing polyp segmentation models are limited by high labeling costs and the small size of datasets. Additionally, vast polyp datasets remain underutilized because these models typically rely on a single type of annotation. To address this dilemma, we introduce MARIO, a mixed supervision model designed to accommodate various annotation types, significantly expanding the range of usable data. MARIO learns from underutilized datasets by incorporating five forms of supervision: pixel-level, box-level, polygon-level, scribble-level, and point-level. Each form of supervision is associated with a tailored loss that effectively leverages the supervision labels while minimizing the noise. This allows MARIO to move beyond the constraints of relying on a single annotation type. Furthermore, MARIO primarily utilizes datasets with weak and cheap annotations, reducing the dependence on large-scale, fully annotated ones. Experimental results across five benchmark datasets demonstrate that MARIO consistently outperforms existing methods, highlighting its efficacy in balancing trade-offs between different forms of supervision and maximizing polyp segmentation performance.
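For box-level labels, the TL;DR names a mask-to-box ($M2B$) transformation. One plausible reading, sketched here as an assumption and not taken from the paper's code, is to project the predicted mask onto its rows and columns with a max and rebuild a box-shaped map from the two projections, so that the prediction is compared with the box only through its extent and not through its interior shape. The helper names `mask_to_box` and `box_to_mask` and the `(top, left, bottom, right)` box convention are hypothetical.

```python
def mask_to_box(mask):
    """Turn a 2D probability mask into a box-shaped map.

    Each row and each column is reduced with max; a pixel of the output is
    the min of its row projection and its column projection.
    """
    row_max = [max(row) for row in mask]
    col_max = [max(col) for col in zip(*mask)]
    return [[min(r, c) for c in col_max] for r in row_max]

def box_to_mask(height, width, box):
    """Rasterize an inclusive (top, left, bottom, right) box as a 0/1 map."""
    top, left, bottom, right = box
    return [
        [1.0 if top <= y <= bottom and left <= x <= right else 0.0
         for x in range(width)]
        for y in range(height)
    ]

# An L-shaped prediction whose bounding box spans rows 1-2 and columns 1-2.
pred = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
projected = mask_to_box(pred)
box_label = box_to_mask(4, 4, (1, 1, 2, 2))
print(projected == box_label)
```

Under this reading, the box label never forces the pixels inside the box to be foreground, which is one way a tailored loss can use a coarse label while limiting the noise it introduces.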

Paper Structure (10 sections, 2 equations, 2 figures, 2 tables)

Introduction
Methodology
Transformer-based Polyp Segmentation Model
Loss Function Definition
Experiment
Dataset and Implementation Detail
Performance Comparison
Ablation Study
Conclusion
COMPLIANCE WITH ETHICAL STANDARDS

Figures (2)

Figure 1: Illustration of our MARIO framework.
Figure 2: Visualization results of our MARIO and other comparison methods. "GT" denotes the ground truth.
