On Efficient Variants of Segment Anything Model: A Survey
Xiaorui Sun, Jun Liu, Heng Tao Shen, Xiaofeng Zhu, Ping Hu
TL;DR
This survey analyzes the growing ecosystem of efficient Segment Anything Model (SAM) variants, detailing how lightweight backbones, distillation, quantization, pruning, and refactoring reduce latency while preserving segmentation quality. It introduces a structured taxonomy of approaches for accelerating the SegAny and SegEvery tasks and provides a unified evaluation across COCO, LVIS, SGinW, and UVO to compare efficiency and accuracy. Key contributions include a comprehensive catalog of methods (from training from scratch to encoder-level distillation and sampler optimizations) and practical guidance for hardware-specific deployment. The findings show that carefully designed backbones (e.g., those of EfficientViT-SAM and NanoSAM) and sampling strategies can substantially improve throughput on edge devices and CPUs with only modest accuracy trade-offs. They also point future research toward hybrid architectures, sparsity, and multi-domain universal segmentation.
Abstract
The Segment Anything Model (SAM) is a foundational model for image segmentation tasks, known for its strong generalization across diverse applications. However, its impressive performance comes with significant computational and resource demands, making it challenging to deploy in resource-limited environments such as edge devices. To address this, a variety of SAM variants have been proposed to enhance efficiency while preserving accuracy. This survey provides the first comprehensive review of these efficient SAM variants. We begin by exploring the motivations driving this research. We then present the core techniques used in SAM and in model acceleration. This is followed by a detailed exploration of SAM acceleration strategies, categorized by approach, and a discussion of several future research directions. Finally, we offer a unified and extensive evaluation of these methods across various hardware, assessing their efficiency and accuracy on representative benchmarks and providing a clear comparison of their overall performance.
