PA-SAM: Prompt Adapter SAM for High-Quality Image Segmentation
Zhaozhi Xie, Bochen Guan, Weihao Jiang, Muyang Yi, Yue Ding, Hongtao Lu, Lei Zhang
TL;DR
PA-SAM addresses SAM's shortfall in generating high-quality masks by introducing a trainable Prompt Adapter that enriches both dense and sparse prompts while keeping SAM frozen. The adapter enables adaptive detail enhancement and hard point mining, converting image details into refined prompt features that steer the mask decoder. On HQSeg-44K, PA-SAM achieves average improvements of $2.1\%$ in $mIoU$ and $2.7\%$ in $mBIoU$ over HQ-SAM, and it also shows robust zero-shot and open-set performance with less sensitivity to detector errors. The work delivers a practical, lightweight enhancement to SAM with open-source code and models.
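For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean IoU over binary masks could be computed. It is not the authors' evaluation code: the names `mask_iou` and `mean_iou`, the nested-list mask format, and the convention that two empty masks score 1.0 are all assumptions made for illustration. The boundary variant ($mBIoU$) is generally understood to apply the same ratio only to pixels in a narrow band around each mask contour, which is not implemented here.

```python
from typing import Sequence

# Assumed format: a 2D binary mask as nested sequences, 1 = foreground.
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, gt: Mask) -> float:
    """Intersection over union of two binary masks of the same shape."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mean_iou(preds: Sequence[Mask], gts: Sequence[Mask]) -> float:
    """Average of per-image IoU scores (illustrative mIoU)."""
    if not preds or len(preds) != len(gts):
        raise ValueError("preds and gts must be non-empty and equally long")
    scores = [mask_iou(p, g) for p, g in zip(preds, gts)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    gt = [[1, 0], [0, 0]]
    coarse = [[1, 1], [0, 0]]   # one extra foreground pixel
    exact = [[1, 0], [0, 0]]    # matches the ground truth
    print(mean_iou([coarse, exact], [gt, gt]))
```

Under this definition, an improvement in $mIoU$ means that predicted masks overlap their ground-truth masks more closely on average across the evaluated images.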
Abstract
The Segment Anything Model (SAM) has exhibited outstanding performance in various image segmentation tasks. Despite being trained on over a billion masks, SAM faces challenges in mask prediction quality in numerous scenarios, especially in real-world contexts. In this paper, we introduce a novel prompt-driven adapter into SAM, namely Prompt Adapter Segment Anything Model (PA-SAM), aiming to enhance the segmentation mask quality of the original SAM. By exclusively training the prompt adapter, PA-SAM extracts detailed information from images and optimizes the mask decoder feature at both sparse and dense prompt levels, improving the segmentation performance of SAM to produce high-quality masks. Experimental results demonstrate that PA-SAM outperforms other SAM-based methods in high-quality, zero-shot, and open-set segmentation. The source code and models are available at https://github.com/xzz2/pa-sam.
