UniUltra: Interactive Parameter-Efficient SAM2 for Universal Ultrasound Segmentation

Yue Li, Qing Xu, Yixuan Zhang, Xiangjian He, Qian Zhang, Yuan Yao, Fiseha B. Tesem, Xin Chen, Ruili Wang, Zhen Chen, Chang Wen Chen

TL;DR

UniUltra tackles the domain gap between SAM2's natural-image pretraining and ultrasound segmentation with two components: a context-edge hybrid adapter (CH-Adapter) for parameter-efficient fine-tuning, and a deep-supervised knowledge distillation (DSKD) pipeline that compresses the fine-tuned model into lightweight encoders suitable for clinical settings. CH-Adapter combines a context-aware prompt generator with a four-directional edge-aware Sobel module, so fine-tuning uses only 8.91% of SAM2's parameters while improving boundary delineation in ultrasound images. DSKD transfers domain-specific knowledge from the teacher SAM2 to a compact student at three hierarchical levels, greatly reducing parameters (the UniUltra-mini image encoder has only 0.86M) and FLOPs while maintaining high segmentation performance. Evaluations on six public ultrasound datasets (internal and external) show better generalization and efficiency and a smaller memory footprint than prior methods, indicating potential for bedside deployment in resource-constrained environments.
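To make the idea of a four-directional Sobel edge response concrete, here is a minimal pure-Python sketch. The summary above only says that the module is "four-directional"; the specific 3x3 kernels (0, 90, 45, and 135 degrees), the absolute-sum fusion, and the names `SOBEL_KERNELS`, `correlate3x3`, and `four_direction_edges` are all assumptions for illustration and may differ from the authors' implementation.

```python
# Hypothetical 3x3 kernels: horizontal, vertical, and two diagonal Sobel variants.
# The source does not specify the kernels; these are common textbook choices.
SOBEL_KERNELS = {
    "0deg": [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    "90deg": [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],
    "45deg": [[-2, -1, 0], [-1, 0, 1], [0, 1, 2]],
    "135deg": [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]],
}


def correlate3x3(image, kernel):
    """Valid-mode 3x3 cross-correlation on a 2D list of numbers."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(height - 2):
        row = []
        for c in range(width - 2):
            acc = 0
            for i in range(3):
                for j in range(3):
                    acc += kernel[i][j] * image[r + i][c + j]
            row.append(acc)
        out.append(row)
    return out


def four_direction_edges(image):
    """Fuse the four directional responses by summing their absolute values.

    The fusion rule is an assumption; a learned fusion would also be plausible.
    """
    responses = [correlate3x3(image, k) for k in SOBEL_KERNELS.values()]
    height, width = len(responses[0]), len(responses[0][0])
    return [
        [sum(abs(resp[y][x]) for resp in responses) for x in range(width)]
        for y in range(height)
    ]


# A flat patch gives no response; a vertical step edge gives a strong one.
flat = [[5, 5, 5, 5] for _ in range(4)]
step = [[0, 0, 1, 1] for _ in range(4)]
print(four_direction_edges(flat))
print(four_direction_edges(step))
```

Adding diagonal kernels to the usual horizontal and vertical pair lets the response pick up oblique boundaries, which is one plausible reason to use four directions for irregular lesion contours.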

Abstract

The Segment Anything Model 2 (SAM2) demonstrates remarkable universal segmentation capabilities on natural images. However, its performance on ultrasound images is significantly degraded due to domain disparities. This limitation raises two critical challenges: how to efficiently adapt SAM2 to ultrasound imaging while maintaining parameter efficiency, and how to deploy the adapted model effectively in resource-constrained clinical environments. To address these issues, we propose UniUltra for universal ultrasound segmentation. Specifically, we first introduce a novel context-edge hybrid adapter (CH-Adapter) that enhances fine-grained perception across diverse ultrasound imaging modalities while achieving parameter-efficient fine-tuning. To further improve clinical applicability, we develop a deep-supervised knowledge distillation (DSKD) technique that transfers knowledge from the large image encoder of the fine-tuned SAM2 to a super-lightweight encoder, substantially reducing computational requirements without compromising performance. Extensive experiments demonstrate that UniUltra outperforms state-of-the-art methods and generalizes better. Notably, our framework achieves competitive performance using only 8.91% of SAM2's parameters during fine-tuning, and the final compressed model reduces the parameter count by 94.08% compared to the original SAM2, making it highly suitable for practical clinical deployment. The source code is available at https://github.com/xq141839/UniUltra.
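As a rough illustration of what distilling at several encoder levels could look like, the sketch below sums a per-level feature-matching loss between student and teacher. The TL;DR states that DSKD operates at three hierarchical levels; the use of mean squared error, the equal default weights, the flattened feature lists, and the names `mse` and `deep_supervised_kd_loss` are assumptions, not the paper's confirmed formulation.

```python
def mse(student, teacher):
    """Mean squared error between two equal-length flat feature lists."""
    if len(student) != len(teacher):
        raise ValueError("student and teacher features must have the same length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def deep_supervised_kd_loss(student_feats, teacher_feats, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of per-level feature losses (hypothetical formulation).

    student_feats / teacher_feats: one flat feature list per hierarchical level.
    weights: per-level loss weights; equal weighting is an assumption.
    """
    if not (len(student_feats) == len(teacher_feats) == len(weights)):
        raise ValueError("expected one student, teacher, and weight entry per level")
    return sum(
        w * mse(s, t) for w, s, t in zip(weights, student_feats, teacher_feats)
    )


# Toy features for three levels (made-up numbers, purely for illustration).
student = [[0.0, 0.0], [1.0, 1.0], [2.0]]
teacher = [[1.0, 1.0], [1.0, 1.0], [4.0]]
print(deep_supervised_kd_loss(student, teacher))
```

In practice the student and teacher features at each level would usually need a projection to matching shapes before comparison; that step is omitted here to keep the sketch short.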

Paper Structure

This paper contains 27 sections, 4 equations, 6 figures, 10 tables.

Figures (6)

  • Figure 1: The design pipeline of UniUltra. The process begins with data collection, where sonographers acquire and label ultrasound images. Subsequently, the SAM2 model undergoes fine-tuning on the labeled ultrasound data using PEFT-based CH-Adapters with additional touch input for interaction. Finally, DSKD is employed to compress the size of the fine-tuned SAM2, ensuring its universality and real-world clinical applicability.
  • Figure 2: Performance and efficiency comparisons between state-of-the-art methods and our UniUltra. Results demonstrate the superior universality of UniUltra across different ultrasound scenarios. In particular, the image encoder of UniUltra-mini uses only 0.86M parameters, revealing remarkable memory efficiency.
  • Figure 3: (a) The overview of the proposed UniUltra for universal ultrasound segmentation, consisting of (b) CH-Adapter and (c) DSKD. UniUltra provides a systematic pipeline that adapts SAM2 from natural images to clinical deployment in ultrasound scenarios. Sonographers draw bounding boxes on the ultrasound equipment to outline target areas as touch input.
  • Figure 4: Visualization of universal ultrasound segmentation on internal validation. Our UniUltra exhibits the best results, identifying lesion regions more accurately and delineating precise boundaries with fewer false positives.
  • Figure 5: Ablation study of embedding dimension reduction in UniUltra-mini.
  • ...and 1 more figure