Model Agnostic Preference Optimization for Medical Image Segmentation

Yunseong Nam, Jiwon Jang, Dongkyu Won, Sang Hyun Park, Soopil Kim

TL;DR

The paper tackles domain shifts and data scarcity in medical image segmentation by proposing MAPO, a model-agnostic, dropout-driven preference optimization framework. MAPO generates diverse predictions via dropout, creates preference pairs online, and optimizes with a Direct Preference Optimization objective combined with standard segmentation losses, ensuring stable training. Across multiple 2D and 3D datasets and architectures (CNNs, Transformers, hybrids), MAPO yields consistent Dice improvements and reduced boundary errors (ASD), while also stabilizing optimization on challenging datasets. The approach eliminates reliance on architecture-specific sampling techniques and demonstrates strong practical potential for robust, generalizable medical image segmentation.
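As a rough illustration of the objective described above, the following minimal sketch computes the standard Direct Preference Optimization loss for one preferred/rejected prediction pair and adds it to a segmentation loss. It is not taken from the paper: the function names, the `beta` value, the `weight` factor, and the use of scalar log-likelihoods per prediction are all assumptions made for illustration.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(logp_win: float, logp_lose: float,
             ref_logp_win: float, ref_logp_lose: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for a single preference pair.

    logp_*     : log-likelihood of the preferred / rejected prediction
                 under the model being trained.
    ref_logp_* : the same quantities under a frozen reference model
                 (assumed here to be the warm-up checkpoint).
    beta       : temperature; 0.1 is an illustrative value, not the paper's.
    """
    margin = (logp_win - ref_logp_win) - (logp_lose - ref_logp_lose)
    return -log_sigmoid(beta * margin)


def combined_loss(seg_loss: float, pref_loss: float, weight: float = 1.0) -> float:
    """Hypothetical weighting of segmentation loss (e.g. CE + Dice) and DPO loss."""
    return seg_loss + weight * pref_loss


# The loss falls below log(2) once the model favors the preferred prediction
# more strongly than the reference model does.
print(dpo_loss(-1.0, -3.0, -2.0, -2.0) < math.log(2))
```

When the trained model and the reference model agree exactly, the margin is zero and the loss equals log 2, so only relative improvement over the reference is rewarded.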

Abstract

Preference optimization offers a scalable supervision paradigm based on relative preference signals, yet prior attempts in medical image segmentation remain model-specific and rely on low-diversity prediction sampling. In this paper, we propose MAPO (Model-Agnostic Preference Optimization), a training framework that utilizes Dropout-driven stochastic segmentation hypotheses to construct preference-consistent gradients without direct ground-truth supervision. MAPO is fully architecture- and dimensionality-agnostic, supporting 2D/3D CNN and Transformer-based segmentation pipelines. Comprehensive evaluations across diverse medical datasets reveal that MAPO consistently enhances boundary adherence, reduces overfitting, and yields more stable optimization dynamics compared to conventional supervised training.

Paper Structure

This paper contains 19 sections, 7 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Overview of the proposed model-agnostic preference optimization for medical image segmentation, which consists of four stages: (1) warm-up, (2) preference set generation using dropout, (3) preference optimization, and (4) online preference training.
  • Figure 2: Validation loss ($\mathcal{L}_{CE}+\mathcal{L}_{Dice}$) vs. training epoch during online preference training of U-Net [ronneberger2015unet] and TransAttUNet [chen2021transattunet] on the EBHI dataset [li2023ebhi]. The losses gradually decrease as the preference dataset is updated every 50 epochs.
  • Figure 3: Visualization of pixel-level variance under different stochastic sampling strategies with U-Net [ronneberger2015unet].
  • Figure 4: Qualitative comparison between the proposed method and the baseline using the U-Net [ronneberger2015unet] architecture.
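
The dropout-based preference set generation in Figure 1 and the pixel-level variance in Figure 3 can be sketched with a toy example. This is only a sketch under stated assumptions: the per-pixel linear "model", the helper names, the sample count `k`, the dropout rate `p`, and the choice to rank samples by Dice against a reference mask are all hypothetical, since the summary does not state how MAPO ranks its samples.

```python
import random


def dropout(features, p, rng):
    """Inverted dropout: zero each value with probability p, rescale survivors."""
    keep = 1.0 - p
    return [f / keep if rng.random() < keep else 0.0 for f in features]


def predict_mask(pixel_features, weights, p, rng):
    """Toy per-pixel linear classifier with dropout left active at inference."""
    mask = []
    for feats in pixel_features:
        dropped = dropout(feats, p, rng)
        score = sum(w * f for w, f in zip(weights, dropped))
        mask.append(1 if score > 0 else 0)
    return mask


def dice(pred, target, eps=1e-8):
    """Dice overlap between two flat binary masks."""
    inter = sum(a * b for a, b in zip(pred, target))
    return (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)


def build_preference_pair(pixel_features, weights, target, k=8, p=0.5, seed=0):
    """Draw k stochastic masks; return (preferred, rejected, all_samples).

    Ranking by Dice against `target` is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    samples = [predict_mask(pixel_features, weights, p, rng) for _ in range(k)]
    ranked = sorted(samples, key=lambda m: dice(m, target))
    return ranked[-1], ranked[0], samples


def pixel_variance(samples):
    """Per-pixel variance across the stochastic samples."""
    n = len(samples)
    variances = []
    for pixel_values in zip(*samples):
        mean = sum(pixel_values) / n
        variances.append(sum((v - mean) ** 2 for v in pixel_values) / n)
    return variances


# Four "pixels" with two features each; all values are made up.
features = [[1.0, -0.5], [0.2, 0.3], [-1.0, 0.4], [0.6, -0.7]]
preferred, rejected, samples = build_preference_pair(features, [1.0, 1.0], [1, 1, 0, 0])
print(dice(preferred, [1, 1, 0, 0]) >= dice(rejected, [1, 1, 0, 0]))
```

Pixels whose sign flips under dropout receive high variance, which is the kind of sample diversity that makes the preferred and rejected masks differ.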