Efficient In-Context Medical Segmentation with Meta-driven Visual Prompt Selection
Chenwei Wu, David Restrepo, Zitao Shuai, Zhongming Liu, Liyue Shen
TL;DR
MVPS addresses the sensitivity of in-context medical segmentation to prompt choice and domain shift by learning a meta-driven visual prompt retriever. It meta-trains a transformer-based retriever to select informative image-mask prompts from a support pool while keeping the large vision model frozen, using a Dice-based reward and policy-gradient optimization, with optional task augmentation and test-time adaptation. The approach yields consistent gains across 8 datasets, 4 tasks, and 3 modalities, showing that a data-centric, finetuning-free module can work with multiple backbones and complement model-centric methods such as LoRA. The result is label-efficient, cross-domain medical segmentation with potential for deployment in diverse clinical settings.
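To make the Dice-based reward and policy-gradient idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the retriever is reduced to one learnable logit per candidate prompt (the paper uses a transformer-based retriever), the frozen vision model is a hypothetical stand-in callable, and the names `dice_score`, `reinforce_step`, and `toy_frozen_model` are illustrative assumptions.

```python
import math
import random


def dice_score(pred, target, eps=1e-6):
    """Dice coefficient between two binary masks given as flat 0/1 lists."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * inter + eps) / (total + eps)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def reinforce_step(logits, frozen_model, pool, query_image, query_mask,
                   lr=0.5, baseline=0.0, rng=random):
    """One REINFORCE update of the per-candidate logits.

    Assumption: the policy is a softmax over one logit per pool entry.
    The frozen model is only called, never updated.
    """
    probs = softmax(logits)
    # Sample one prompt index from the current selection policy.
    idx = rng.choices(range(len(pool)), weights=probs, k=1)[0]
    pred = frozen_model(pool[idx], query_image)
    reward = dice_score(pred, query_mask)
    advantage = reward - baseline
    # Gradient of log pi(idx) w.r.t. the logits is one_hot(idx) - probs.
    new_logits = [
        l + lr * advantage * ((1.0 if j == idx else 0.0) - p)
        for j, (l, p) in enumerate(zip(logits, probs))
    ]
    return new_logits, idx, reward


def toy_frozen_model(prompt, query_image):
    """Hypothetical stand-in for a frozen LVM: copies the prompt's mask."""
    _prompt_image, prompt_mask = prompt
    return prompt_mask


# Toy support pool of (image, mask) pairs; images are unused placeholders.
pool = [(None, [1, 1, 0, 0]), (None, [0, 0, 1, 1]), (None, [1, 0, 0, 0])]
query_mask = [1, 1, 0, 0]
logits = [0.0, 0.0, 0.0]
rng = random.Random(0)
for _ in range(200):
    logits, _, _ = reinforce_step(
        logits, toy_frozen_model, pool, None, query_mask, rng=rng
    )
print(softmax(logits))  # selection probabilities after the toy training loop
```

In this toy setting, only the selection logits change; the segmentation model is treated as a black box that turns a chosen prompt into a prediction, which is then scored against the query's ground-truth mask. A learned baseline or a parametric retriever over image features would replace the fixed `baseline` and the logit table in a more realistic setup.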
Abstract
In-context learning (ICL) with Large Vision Models (LVMs) presents a promising avenue in medical image segmentation by reducing the reliance on extensive labeling. However, the ICL performance of LVMs depends heavily on the choice of visual prompts and suffers from domain shifts. While existing works leveraging LVMs for medical tasks have focused mainly on model-centric approaches like fine-tuning, we study an orthogonal, data-centric perspective: how to select good visual prompts to facilitate generalization to the medical domain. In this work, we propose a label-efficient in-context medical segmentation method by introducing a novel Meta-driven Visual Prompt Selection mechanism (MVPS), where a prompt retriever obtained from a meta-learning framework actively selects the optimal images as prompts to promote model performance and generalizability. Evaluated on 8 datasets and 4 tasks across 3 medical imaging modalities, our proposed approach demonstrates consistent gains over existing methods under different scenarios, improving both computational and label efficiency. Finally, we show that MVPS is a flexible, finetuning-free module that can be easily plugged into different backbones and combined with other model-centric approaches.
