MediRound: Multi-Round Entity-Level Reasoning Segmentation in Medical Images
Qinyue Tong, Ziqian Lu, Jun Liu, Rui Zuo, Zheming Lu
TL;DR
This work defines MEMR-Seg, a task for multi-round, entity-level reasoning in medical image segmentation, and introduces MR-MedSeg, a large-scale dataset of 177K multi-round dialogues built from SA-Med2D-20M with GPT-5 augmentation. It proposes MediRound, a baseline model that fuses prior-round masks and dialogue history via an extended LLM–vision pipeline, augmented by a lightweight Judgment & Correction Mechanism to curb error propagation across rounds. Empirical results show MediRound outperforms traditional medical referring segmentation methods and SegLLM-style baselines in multi-round settings, while maintaining strong single-round performance. The work highlights the practical potential of interactive, reasoning-driven segmentation in clinical workflows and provides a foundation for future research in cross-round medical vision–language interaction.
Abstract
Despite progress in medical image segmentation, most existing methods remain task-specific and lack interactivity. Recent text-prompt-based approaches support user-driven and reasoning-based segmentation, but they are confined to single-round dialogues and cannot reason across rounds. In this work, we introduce Multi-Round Entity-Level Medical Reasoning Segmentation (MEMR-Seg), a new task that requires generating segmentation masks through multi-round queries with entity-level reasoning. To support this task, we construct MR-MedSeg, a large-scale dataset of 177K multi-round medical segmentation dialogues featuring entity-based reasoning across rounds. Furthermore, we propose MediRound, an effective baseline model designed for multi-round medical reasoning segmentation. Because multi-round segmentation forms a chain-like pipeline in which errors from one round propagate to later rounds, we introduce a lightweight yet effective Judgment & Correction Mechanism that is applied during model inference. Experimental results demonstrate that our method effectively addresses the MEMR-Seg task and outperforms conventional medical referring segmentation methods.
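To make the chain-like structure concrete, the following is a minimal sketch of a multi-round loop with a judge-then-correct step, not MediRound's actual implementation. The abstract does not specify how judgment or correction are performed, so `run_dialogue`, `segment`, `judge`, `correct`, and the toy stand-ins are all hypothetical names, and masks are simplified to sets of pixel coordinates.

```python
from typing import Callable, List, Optional, Set, Tuple

# Simplifying assumption: a mask is a set of (row, col) pixel coordinates.
Mask = Set[Tuple[int, int]]


def run_dialogue(
    queries: List[str],
    segment: Callable[[str, Optional[Mask]], Mask],
    judge: Callable[[Mask], bool],
    correct: Callable[[Mask], Mask],
) -> List[Mask]:
    """Run rounds in sequence; each round conditions on the previous mask."""
    masks: List[Mask] = []
    prev: Optional[Mask] = None
    for query in queries:
        # Judgment: check the prior-round mask before reusing it as context.
        if prev is not None and not judge(prev):
            # Correction: repair the mask so the error does not propagate.
            prev = correct(prev)
        prev = segment(query, prev)
        masks.append(prev)
    return masks


# Toy stand-ins (hypothetical; a real system would call a model here).
def toy_segment(query: str, prev: Optional[Mask]) -> Mask:
    if prev is None:
        return set()  # simulate a failed first round
    # Later rounds depend on the prior entity: shift it one column right.
    return {(r, c + 1) for (r, c) in prev}


def toy_judge(mask: Mask) -> bool:
    return len(mask) > 0  # treat an empty mask as a failure


def toy_correct(mask: Mask) -> Mask:
    return {(0, 0)}  # placeholder for a re-segmented prior entity


if __name__ == "__main__":
    rounds = ["segment the liver", "segment the lesion next to it"]
    print(run_dialogue(rounds, toy_segment, toy_judge, toy_correct))
```

In this toy setting the second round recovers a non-empty mask only because the failed first-round mask is repaired before reuse. If `judge` always accepts, the empty mask carries forward and the second round fails as well, which is the error propagation the abstract describes.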
