Diversified and Personalized Multi-rater Medical Image Segmentation
Yicheng Wu, Xiangde Luo, Zhe Xu, Xiaoqing Guo, Lie Ju, Zongyuan Ge, Wenjun Liao, Jianfei Cai
TL;DR
This work tackles annotation ambiguity in medical image segmentation by introducing D-Persona, a two-stage framework that first learns a shared latent space to capture diverse expert opinions and then derives per-expert prompts via attention-based projections for personalized segmentation. Stage I trains a Probabilistic U-Net backbone with a bound-constrained loss to broaden segmentation diversity, while Stage II uses multiple projection heads and cross-attention against a prior bank to deliver expert-specific outputs without retraining the core model. Evaluations on NPC-170 and LIDC-IDRI show state-of-the-art performance on both diversification and personalization metrics, indicating that the method can provide multiple plausible segmentations alongside personalized predictions. The approach offers practical benefits for clinical workflows by supplying diverse opinions and per-expert analysis within a single framework, with code to be released for reproducibility.
Abstract
Annotation ambiguity, caused by inherent data uncertainties such as blurred boundaries in medical scans and by differences in observer expertise and preferences, has become a major obstacle to training deep-learning-based medical image segmentation models. The common remedy is to gather multiple annotations from different experts, which leads to the setting of multi-rater medical image segmentation. Existing works aim to merge the different annotations into a single "ground truth" that is often unattainable in medical contexts, to generate diverse results, or to produce personalized results corresponding to individual expert raters. Here, we set a more ambitious goal for multi-rater medical image segmentation: obtaining both diversified and personalized results. Specifically, we propose a two-stage framework named D-Persona (first Diversification and then Personalization). In Stage I, we exploit the multiple given annotations to train a Probabilistic U-Net model with a bound-constrained loss that improves prediction diversity. This constructs a common latent space in which different latent codes denote diversified expert opinions. Then, in Stage II, we design multiple attention-based projection heads that adaptively query the corresponding expert prompts from the shared latent space and then perform personalized medical image segmentation. We evaluated the proposed model on our in-house Nasopharyngeal Carcinoma dataset and on the public lung nodule dataset LIDC-IDRI. Extensive experiments demonstrate that D-Persona provides diversified and personalized results at the same time, achieving new SOTA performance for multi-rater medical image segmentation. Our code will be released at https://github.com/ycwu1997/D-Persona.
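
To make the Stage II idea concrete, the following is a minimal sketch of how one per-expert query might attend over a bank of latent codes to form an expert prompt. It is not the authors' implementation: the abstract does not specify the attention layout, so the scaled dot-product form, the names `expert_prompt`, `prior_bank`, and `expert_queries`, and the toy 2-D codes are all assumptions chosen for illustration. In a real model the queries, keys, and values would be learned projections rather than fixed vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def expert_prompt(query, prior_bank):
    """Scaled dot-product attention of one query over a bank of latent codes.

    Returns the attention-weighted sum of the codes (the "prompt") together
    with the attention weights. Assumption: codes act as both keys and values.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, code) / scale for code in prior_bank])
    dim = len(prior_bank[0])
    prompt = [
        sum(w * code[i] for w, code in zip(weights, prior_bank))
        for i in range(dim)
    ]
    return prompt, weights


# Hypothetical bank of three 2-D latent codes, standing in for codes
# sampled from the shared latent space built in Stage I.
prior_bank = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

# Hypothetical per-expert query vectors (learned in practice, fixed here).
expert_queries = {"expert_a": [4.0, 0.0], "expert_b": [0.0, 4.0]}

# One prompt per expert, all drawn from the same shared bank.
prompts = {
    name: expert_prompt(query, prior_bank)
    for name, query in expert_queries.items()
}
```

Because every expert reads from the same bank, only the lightweight per-expert queries differ between raters, which is consistent with the abstract's description of multiple projection heads sharing one latent space.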
