Towards Collective Intelligence: Uncertainty-aware SAM Adaptation for Ambiguous Medical Image Segmentation
Mingzhou Jiang, Jiaying Zhou, Junde Wu, Tianyang Wang, Yueming Jin, Min Xu
TL;DR
This paper addresses the challenge of ambiguous medical image segmentation, where multiple experts provide valid yet differing interpretations. It introduces UA-SAM, an uncertainty-aware adapter that learns a multi-expert latent distribution via a conditional variational autoencoder and aligns uncertainty with adapter positions and prompts to generate diverse, clinically plausible segmentations. The approach enables a one-to-many mapping from an input image to plausible segmentation masks, capturing expert variability while remaining efficient through a parameter-efficient adapter design that includes Position-conditioned Attention and Prompt Channel Attention. Across seven multi-expert benchmarks, UA-SAM achieves state-of-the-art consensus performance and favorable distributional alignment as measured by Generalized Energy Distance (GED), demonstrating potential to enhance clinical reliability and interpretability in ambiguous segmentation tasks.
Abstract
Collective intelligence from multiple medical experts consistently surpasses individual expertise in clinical diagnosis, particularly for ambiguous medical image segmentation tasks involving unclear tissue boundaries or pathological variations. The Segment Anything Model (SAM), a powerful vision foundation model originally designed for natural image segmentation, has shown remarkable potential when adapted to medical image segmentation tasks. However, existing SAM adaptation methods follow a single-expert paradigm: they train on individual expert annotations and predict deterministic masks. These methods systematically ignore the inherent uncertainty and variability in expert annotations, which contradicts clinical practice, where multiple specialists provide different yet equally valid interpretations that collectively enhance diagnostic confidence. We propose an Uncertainty-aware Adapter, the first SAM adaptation framework designed to move from a single-expert mindset to a collective intelligence representation. Our approach integrates stochastic uncertainty sampling from a Conditional Variational Autoencoder into the adapters, enabling the generation of diverse predictions that capture the distribution of expert knowledge rather than individual expert annotations. We employ a novel position-conditioned control mechanism to integrate multi-expert knowledge, ensuring that the output distribution closely aligns with the multi-annotation distribution. Comprehensive evaluations across seven medical segmentation benchmarks demonstrate that our collective intelligence-based adaptation achieves superior performance while maintaining computational efficiency, establishing a new adaptation framework for reliable clinical implementation.
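
To make the notion of distributional alignment concrete, the following is a minimal sketch of the squared Generalized Energy Distance (GED) as it is commonly defined in the ambiguous-segmentation literature, using 1 - IoU as the distance between masks. The function names (`iou_distance`, `mean_pairwise`, `ged_squared`) and the flat 0/1 mask representation are assumptions chosen for illustration; the exact evaluation protocol used in this work is not specified here.

```python
from itertools import product


def iou_distance(a, b):
    """Return 1 - IoU for two binary masks given as flat sequences of 0/1."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two empty masks agree perfectly, so their distance is 0.
    return 0.0 if union == 0 else 1.0 - inter / union


def mean_pairwise(xs, ys):
    """Average distance over all pairs (x, y) with x from xs and y from ys."""
    pairs = list(product(xs, ys))
    return sum(iou_distance(a, b) for a, b in pairs) / len(pairs)


def ged_squared(samples, annotations):
    """Squared GED: 2*E[d(S, Y)] - E[d(S, S')] - E[d(Y, Y')].

    samples:     masks drawn from the model for one image
    annotations: masks provided by the experts for the same image
    """
    cross = mean_pairwise(samples, annotations)
    within_samples = mean_pairwise(samples, samples)
    within_annotations = mean_pairwise(annotations, annotations)
    return 2.0 * cross - within_samples - within_annotations


if __name__ == "__main__":
    # Hypothetical 4-pixel masks, for illustration only.
    experts = [(1, 1, 0, 0), (1, 1, 1, 0)]
    predictions = [(1, 1, 0, 0), (1, 1, 1, 0)]
    print(ged_squared(predictions, experts))
```

Lower values indicate that the sampled masks are distributed more like the expert annotations. The value is zero when the two sets of masks coincide, and it increases both when predictions are inaccurate and when their diversity differs from that of the experts.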
