UniUltra: Interactive Parameter-Efficient SAM2 for Universal Ultrasound Segmentation
Yue Li, Qing Xu, Yixuan Zhang, Xiangjian He, Qian Zhang, Yuan Yao, Fiseha B. Tesem, Xin Chen, Ruili Wang, Zhen Chen, Chang Wen Chen
TL;DR
UniUltra tackles the domain gap between SAM2's natural-image pretraining and ultrasound segmentation with two components: a context-edge hybrid adapter (CH-Adapter) for parameter-efficient fine-tuning, and a deep-supervised knowledge distillation (DSKD) pipeline that compresses the fine-tuned model into lightweight encoders suitable for clinical settings. CH-Adapter combines a context-aware prompt generator with a four-directional edge-aware Sobel module, which cuts the trainable parameters during fine-tuning to 8.91% of SAM2's parameters and improves boundary delineation in ultrasound images. DSKD transfers domain-specific knowledge from the teacher SAM2 to a compact student at three hierarchical levels, yielding large reductions in parameters (down to 0.86M for UniUltra-mini) and FLOPs while maintaining high segmentation performance. Evaluations on six public ultrasound datasets (internal and external) show strong generalization, computational efficiency, and a small memory footprint, indicating potential for bedside deployment in resource-constrained environments.
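To make the "four-directional edge-aware Sobel" idea concrete, the following is a minimal, dependency-free sketch of Sobel filtering along four orientations (0, 45, 90, and 135 degrees). It is an illustration only, not the authors' CH-Adapter: the kernel set is the common textbook choice, and the names `SOBEL_KERNELS`, `correlate3x3`, and `four_direction_edges`, as well as the fusion rule (summing absolute responses), are assumptions made for this example.

```python
# Illustrative sketch only; the fusion rule and all names are assumptions,
# not taken from the UniUltra source code.

# Four 3x3 Sobel kernels: horizontal, vertical, and the two diagonals.
SOBEL_KERNELS = {
    "0deg": [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    "90deg": [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],
    "45deg": [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]],
    "135deg": [[-2, -1, 0], [-1, 0, 1], [0, 1, 2]],
}


def correlate3x3(image, kernel):
    """Valid-mode 3x3 cross-correlation over a 2D list of numbers."""
    h, w = len(image), len(image[0])
    out = [[0] * (w - 2) for _ in range(h - 2)]
    for y in range(h - 2):
        for x in range(w - 2):
            out[y][x] = sum(
                kernel[i][j] * image[y + i][x + j]
                for i in range(3)
                for j in range(3)
            )
    return out


def four_direction_edges(image):
    """Fuse the four directional responses by summing absolute values
    (an assumed fusion rule chosen for simplicity)."""
    responses = [correlate3x3(image, k) for k in SOBEL_KERNELS.values()]
    h, w = len(responses[0]), len(responses[0][0])
    return [
        [sum(abs(r[y][x]) for r in responses) for x in range(w)]
        for y in range(h)
    ]


if __name__ == "__main__":
    # A 4x4 image with a vertical step edge between columns 1 and 2.
    step = [[0, 0, 1, 1] for _ in range(4)]
    print(four_direction_edges(step))
```

Because every kernel sums to zero, a constant region gives a zero response, so the fused map highlights only intensity transitions such as lesion boundaries. In a learned adapter, these fixed filters would typically be applied to feature maps and combined with trainable layers.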
Abstract
The Segment Anything Model 2 (SAM2) demonstrates remarkable universal segmentation capabilities on natural images. However, its performance on ultrasound images is significantly degraded due to domain disparities. This limitation raises two critical challenges: how to efficiently adapt SAM2 to ultrasound imaging while maintaining parameter efficiency, and how to deploy the adapted model effectively in resource-constrained clinical environments. To address these issues, we propose UniUltra for universal ultrasound segmentation. Specifically, we first introduce a novel context-edge hybrid adapter (CH-Adapter) that enhances fine-grained perception across diverse ultrasound imaging modalities while achieving parameter-efficient fine-tuning. To further improve clinical applicability, we develop a deep-supervised knowledge distillation (DSKD) technique that transfers knowledge from the large image encoder of the fine-tuned SAM2 to a super lightweight encoder, substantially reducing computational requirements without compromising performance. Extensive experiments demonstrate that UniUltra outperforms state-of-the-art methods and generalizes better. Notably, our framework achieves competitive performance while fine-tuning only 8.91% of SAM2's parameters, and the final compressed model reduces the parameter count by 94.08% compared to the original SAM2, making it highly suitable for practical clinical deployment. The source code is available at https://github.com/xq141839/UniUltra.
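As a rough illustration of deep-supervised distillation, the sketch below computes a feature-matching loss between a teacher and a student encoder at several hierarchical levels and sums the per-level terms. It is a simplified assumption, not the paper's DSKD objective: the use of mean squared error, the equal default weights, flat feature vectors of matching size, and the names `mse` and `deep_supervised_kd_loss` are all hypothetical choices for this example.

```python
# Illustrative sketch only; the loss form, weights, and names are assumptions,
# not taken from the UniUltra source code.

def mse(a, b):
    """Mean squared error between two equal-length flat feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def deep_supervised_kd_loss(student_feats, teacher_feats, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of per-level feature-matching losses.

    student_feats / teacher_feats: one flat feature vector per encoder level,
    ordered from shallow to deep. Supervising every level, rather than only
    the final output, is what makes the distillation "deep-supervised".
    """
    if not (len(student_feats) == len(teacher_feats) == len(weights)):
        raise ValueError("need one student, teacher, and weight entry per level")
    return sum(
        w * mse(s, t)
        for w, s, t in zip(weights, student_feats, teacher_feats)
    )


if __name__ == "__main__":
    teacher = [[1.0, 2.0], [0.0, 0.0], [3.0]]
    student = [[1.0, 2.0], [1.0, 1.0], [1.0]]
    print(deep_supervised_kd_loss(student, teacher))
```

In practice the student and teacher features usually differ in channel count or resolution, so a projection or resizing step would be needed before comparing them; that step is omitted here for brevity.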
