Few Exemplar-Based General Medical Image Segmentation via Domain-Aware Selective Adaptation

Chen Xu, Qiming Huang, Yuqi Hou, Jiangxing Wu, Fan Zhang, Hyung Jin Chang, Jianbo Jiao

TL;DR

This paper introduces a domain-aware selective adaptation approach that adapts the general knowledge learned by a large model trained on natural images to the corresponding medical domains/modalities using only a few exemplars, providing an efficient and LMIC-friendly solution.

Abstract

Medical image segmentation poses challenges due to domain gaps, data modality variations, and dependency on domain knowledge or experts, especially for low- and middle-income countries (LMICs). Humans, by contrast, can segment different medical images given only a few exemplars (with corresponding labels), even without extensive domain-specific clinical training. In addition, current SAM-based medical segmentation models use fine-grained visual prompts during the testing phase, such as a bounding box (bbox) prompt derived from the bounding rectangle of a manually annotated target segmentation mask. However, in actual clinical scenarios, no such precise prior knowledge is available. Our experimental results also reveal that previous models nearly fail to predict when given coarser bbox prompts. Considering these issues, in this paper, we introduce a domain-aware selective adaptation approach to adapt the general knowledge learned from a large model trained with natural images to the corresponding medical domains/modalities, with access to only a few (e.g. fewer than 5) exemplars. Our method mitigates the aforementioned limitations, providing an efficient and LMIC-friendly solution. Extensive experimental analysis showcases the effectiveness of our approach, offering potential advancements in healthcare diagnostics and clinical applications in LMICs.

Paper Structure

This paper contains 21 sections, 2 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: Illustration of the relative relationship and distribution of samples from different datasets, using SMG2020 for feature visualisation. A clear domain gap can be observed between medical data (red points) and general natural images (blue/green points).
  • Figure 2: Four settings of using bbox prompts during training and testing stages. The coarse bounding box prompt is designed to be GT-agnostic, with different ratios indicating the proportion of pixels by which the box region is shrunk inward relative to the entire image. Pseudo-code for coarse bbox prompt generation is shown in Algorithm 1.
  • Figure 3: The proposed FEMed architecture. The pre-trained SAM image encoder is equipped with two specialised Adapters: (a) the Multi-Scale Features Adapter that captures features at various granularities through pyramid pooling, and (b) the High-Frequency Adapter that emphasises salient textural details via frequency domain analysis. The output features from these Adapters are fed into the Selection Module which contains a trainable decision layer that takes $F_k^I$ (where $k$ refers to the features from the $k$-th layer) as input to generate the weights for aggregating $F_f$ and $F_p$.
  • Figure 4: Qualitative performance across three medical datasets (LiTS17, BraTS2021, and Kvasir-Seg) using different methods: MedSAM [ma2024segment], SAM-MED2D [cheng2023sam], and our proposed method ("Ours"). For each method, we show the segmentation results with different numbers of exemplars (i.e. 1, 5, and 10).
  • Figure 5: The effect of varying bounding box overlapping rates (i.e. the proportion of pixels by which the box region is shrunk inward relative to the entire image, the rate shown in Fig. 2). All results are reported via training with a single exemplar.
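
The GT-agnostic coarse bbox prompt described in the Figure 2 and Figure 5 captions can be illustrated with a minimal sketch. This is not the paper's Algorithm 1, which is not reproduced on this page. The function name `coarse_bbox` is hypothetical, and the sketch assumes that the shrink ratio applies to each image dimension and is split evenly between the two opposite sides, so a ratio of 0 yields the full image; the paper may define the ratio differently.

```python
def coarse_bbox(width, height, ratio):
    """Return a GT-agnostic box (x_min, y_min, x_max, y_max).

    Assumption (not confirmed by the source): `ratio` is the fraction of
    each image dimension removed in total, split evenly between the two
    opposite sides. The box depends only on the image size, never on the
    ground-truth mask.
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must lie in [0, 1)")
    dx = width * ratio / 2.0   # margin removed on the left and on the right
    dy = height * ratio / 2.0  # margin removed on the top and on the bottom
    return (dx, dy, width - dx, height - dy)


if __name__ == "__main__":
    # Boxes for a 256 x 256 image at a few illustrative ratios.
    for r in (0.0, 0.25, 0.5):
        print(r, coarse_bbox(256, 256, r))
```

Because the box is computed from the image dimensions alone, the same prompt can be produced at test time without any annotation, which is the property the captions call GT-agnostic.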