
nnSAM: Plug-and-play Segment Anything Model Improves nnUNet Performance

Yunxiang Li, Bowen Jing, Zihan Li, Jing Wang, You Zhang

TL;DR

nnSAM targets the gap between general-purpose foundation models such as SAM and domain-specific segmentation models by combining SAM's robust feature extraction with nnUNet's automatically configured, prompt-free segmentation. It pairs the pre-trained SAM encoder with nnUNet's encoder and feeds their concatenated embeddings into a decoder with two heads: a segmentation head and a level-set, curvature-based regression head that learns anatomical shape priors from limited data. Across four medical segmentation tasks, nnSAM outperforms the compared methods; for brain white matter with 20 training samples it reaches a Dice score of 82.77% and an ASD of 1.14 mm. Ablation confirms that both the SAM encoder and the shape-prior head contribute to the gain. The approach shows strong potential for accurate, automatic medical image segmentation in low-data regimes, with practical implications for clinical workflows and small-sample learning.

Abstract

Automatic segmentation of medical images is crucial in modern clinical workflows. The Segment Anything Model (SAM) has emerged as a versatile tool for image segmentation without domain-specific training, but it requires human prompts and may perform poorly in specialized domains. Traditional models like nnUNet segment automatically at inference time and are effective within their target domain, but they need extensive domain-specific training. To combine the strengths of foundation and domain-specific models, we propose nnSAM, which integrates SAM's robust feature extraction with nnUNet's automatic configuration to improve segmentation accuracy on small datasets. nnSAM rests on two main components: (1) combining SAM's feature extraction with nnUNet's domain-specific adaptation, and (2) a boundary shape supervision loss based on level set functions and curvature calculations, which lets the model learn anatomical shape priors from limited data. We evaluated nnSAM on four segmentation tasks: brain white matter, liver, lung, and heart segmentation. Our method outperformed the others, achieving the highest Dice score of 82.77% and the lowest ASD of 1.14 mm in brain white matter segmentation with 20 training samples, compared to nnUNet's Dice score of 79.25% and ASD of 1.36 mm. A sample size study showed that nnSAM's advantage is most pronounced when fewer training samples are available. Our results demonstrate significant improvements in segmentation performance with nnSAM, showcasing its potential for small-sample learning in medical image segmentation.
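The Dice scores quoted above measure overlap between a predicted mask and the ground-truth mask: twice the number of shared foreground pixels divided by the total foreground pixels in both masks. The following is a minimal sketch of that standard definition in plain Python, not the evaluation code used by the authors. The function name `dice_score`, the toy masks, and the convention for two empty masks are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    pred_flat = [v for row in pred for v in row]
    target_flat = [v for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred_flat, target_flat) if p and t)
    # Foreground pixel count of each mask, added together.
    total = sum(1 for p in pred_flat if p) + sum(1 for t in target_flat if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 3x4 masks (hypothetical data): 5 foreground pixels each, 4 shared.
prediction = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
score = dice_score(prediction, ground_truth)
print(f"Dice = {100 * score:.2f}%")  # 2 * 4 / (5 + 5) = 80.00%
```

Dice rewards overlap but says little about how far a boundary error strays from the true contour, which is why the abstract reports a surface-distance metric (ASD, in mm) alongside it.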

Paper Structure

This paper contains 18 sections, 11 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: The architecture of nnSAM. nnSAM integrates nnUNet’s encoder with the pre-trained SAM encoder. The corresponding embeddings from the two encoders are concatenated and passed to nnUNet’s decoder, which has two output heads: a segmentation head and a level set-based regression head. The segmentation head produces the final output, while the regression head helps the model capture shape priors during training.
  • Figure 2: Segmentation visualization results for different methods on MR brain white matter segmentation, with the numbers on the left representing different training sample sizes. For the displayed images of each training sample size, a full segmentation (upper row) and a zoomed-in segmentation (lower row) are shown.
  • Figure 3: Segmentation visualization results for different methods on CT heart substructure segmentation.
  • Figure 4: Segmentation visualization results for different methods on CT liver segmentation.
  • Figure 5: Segmentation visualization results for different methods on chest X-ray segmentation.
  • ...and 1 more figure
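Figure 1 describes a regression head trained against a level set-based target next to the segmentation head. This excerpt does not give the exact level-set definition, the curvature term, or the loss weighting, so the sketch below only illustrates one common choice: a signed distance map as the level-set target, with a mean-squared-error term added to the segmentation loss. The names `signed_distance_map`, `mse`, `total_loss`, and the `weight` parameter are hypothetical, and the sign convention (negative inside the object) is an assumption.

```python
import math


def signed_distance_map(mask):
    """Brute-force signed distance to the nearest pixel of the opposite class.

    Negative inside the object, positive outside (an assumed convention).
    Quadratic in the number of pixels, so suitable only for tiny masks.
    """
    pixels = [(r, c) for r, row in enumerate(mask) for c in range(len(row))]
    inside = [(r, c) for r, c in pixels if mask[r][c]]
    outside = [(r, c) for r, c in pixels if not mask[r][c]]
    if not inside or not outside:
        raise ValueError("mask needs both foreground and background pixels")
    sdm = [[0.0] * len(row) for row in mask]
    for r, c in pixels:
        others = outside if mask[r][c] else inside
        dist = min(math.hypot(r - rr, c - cc) for rr, cc in others)
        sdm[r][c] = -dist if mask[r][c] else dist
    return sdm


def mse(pred, target):
    """Mean squared error between two same-shaped nested lists."""
    diffs = [(p - t) ** 2 for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(diffs) / len(diffs)


def total_loss(seg_loss, regression_pred, level_set_target, weight=1.0):
    """Segmentation loss plus a weighted level-set regression term.

    `weight` is a hypothetical balancing factor, not a value from the paper.
    """
    return seg_loss + weight * mse(regression_pred, level_set_target)


# Toy mask (hypothetical data): a 3x3 square object in a 5x5 image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
target = signed_distance_map(mask)
for row in target:
    print(" ".join(f"{v:5.2f}" for v in row))
```

On real images, a Euclidean distance transform such as SciPy's `scipy.ndimage.distance_transform_edt` would typically replace the brute-force loop. The curvature calculation mentioned in the abstract is left out here because its formulation is not stated in this excerpt.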