LIMIS: Towards Language-based Interactive Medical Image Segmentation

Lena Heinemann, Alexander Jaus, Zdravko Marinov, Moon Kim, Maria Francesca Spadea, Jens Kleesiek, Rainer Stiefelhagen

TL;DR

LIMIS is the first purely language-based interactive medical image segmentation model. It produces high-quality initial segmentation masks and lets users adapt those masks using only language, opening up interactive segmentation to scenarios where physicians need their hands free for other tasks.

Abstract

In this work, we introduce LIMIS, the first purely language-based interactive medical image segmentation model. We achieve this by adapting Grounded SAM to the medical domain and designing a language-based model interaction strategy that allows radiologists to incorporate their knowledge into the segmentation process. LIMIS produces high-quality initial segmentation masks by leveraging medical foundation models and allows users to adapt segmentation masks using only language, opening up interactive segmentation to scenarios where physicians need their hands free for other tasks. We evaluate LIMIS on three publicly available medical datasets in terms of performance and usability, with experts from the medical domain confirming its high-quality segmentation masks and its interactive usability.

Paper Structure

This paper contains 12 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Top: Manual Language-based Adaptation options. Bottom: LIMIS flowchart showing user input processing from language prompt to final mask via Grounding DINO (Lang2BBox), ScribblePrompt (BBox2Mask), and User Interaction Loop.
  • Figure 2: Dice score over interaction steps for two images. Step 0 is the initial mask; if the "default" option was accepted, the resulting mask is step 1. Large circles mark the user’s final chosen mask. Stars indicate that a step other than the latest one was adapted, marking both the adapted step and the resulting step.
  • Figure 3: Liver segmentation mask over iteration steps. The first image shows the CT scan, the second the ground truth (gt), and the third the initial LIMIS prediction. "Default" presents the mask after the default option, and the last two images show masks from steps 2 and 3.
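
Figure 2 tracks the Dice score across interaction steps. As a reminder of what that metric measures, the following is a minimal sketch of the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), applied to a sequence of masks. The function name `dice_score`, the nested-list mask representation, and the tiny 3x3 masks are illustrative assumptions, not data or code from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    # Convention assumed here: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical toy masks (not from the paper): a ground truth and three
# successive masks, standing in for the initial mask and two adaptations.
ground_truth = [[0, 1, 1],
                [0, 1, 1],
                [0, 0, 0]]
steps = [
    [[1, 1, 0], [0, 1, 0], [0, 0, 0]],  # step 0: initial mask
    [[0, 1, 1], [0, 1, 0], [0, 0, 0]],  # step 1: after one adaptation
    [[0, 1, 1], [0, 1, 1], [0, 0, 0]],  # step 2: matches the ground truth
]
scores = [dice_score(mask, ground_truth) for mask in steps]
for step, score in enumerate(scores):
    print(f"step {step}: Dice = {score:.3f}")
```

Plotting such per-step scores against the step index gives a curve of the kind Figure 2 describes, with higher values indicating closer agreement with the ground truth.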