NubbleDrop: A Simple Way to Improve Matching Strategy for Prompted One-Shot Segmentation
Zhiyu Xu, Qingliang Chen
TL;DR
This work analyzes two core weaknesses in prompt-based, SAM-driven one-shot segmentation: that patch similarity derived from raw features is distorted by complex feature interactions, and that channel-value distributions are uneven, giving dominance to a few channels. It proposes NubbleDrop, a training-free method that randomly drops feature channels during matching to mitigate deceptive channels with negligible overhead. Across COCO-20i, LVIS-92i, FSS-1000, and PASCAL-Part, and over multiple vision foundation models, MN (Matcher with NubbleDrop) achieves notable gains (e.g., 53.5 mIoU on COCO-20i and 34.0% on LVIS-92i) and demonstrates strong cross-backbone improvements, underscoring the method’s robustness and transferability. The results imply that simple, low-cost channel perturbations can meaningfully improve prompting-based segmentation when facing imperfect feature representations, with potential applicability to a broader set of similarity computing tasks.
Abstract
Driven by segmentation models trained on large-scale data, such as SAM, research in one-shot segmentation has advanced significantly. Recent contributions such as PerSAM and MATCHER, presented at ICLR 2024, take a similar approach: they leverage SAM with one or a few reference images to generate high-quality segmentation masks for target images. Specifically, they use raw encoded features to compute the cosine similarity, along the channel dimension, between patches in the reference and target images, and use it to generate prompt points or boxes for the target images, a technique referred to as the matching strategy. However, relying solely on raw features may introduce biases and may lack robustness for such a complex task. To address this concern, we examine the issues of feature interaction and uneven distribution inherent in raw-feature-based matching. In this paper, we propose NubbleDrop, a simple and training-free method that enhances the validity and robustness of the matching strategy at no additional computational cost. The core idea is to randomly drop feature channels (setting them to zero) during the matching process, thereby preventing models from being influenced by channels containing deceptive information. This technique mimics discarding pathological nubbles, and it can be seamlessly applied to other similarity computing scenarios. We conduct a comprehensive set of experiments, considering a wide range of factors, to demonstrate the effectiveness and validity of the proposed method. Our results show significant improvements from this simple and straightforward approach.
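The channel-dropping idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the `drop_rate` parameter, the use of a single random mask shared by reference and target features, and the absence of any rescaling are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity(ref, tgt, eps=1e-8):
    """Cosine similarity along the channel axis.

    ref: (N_ref, C) reference patch features
    tgt: (N_tgt, C) target patch features
    returns: (N_ref, N_tgt) similarity matrix
    """
    ref_n = ref / (np.linalg.norm(ref, axis=1, keepdims=True) + eps)
    tgt_n = tgt / (np.linalg.norm(tgt, axis=1, keepdims=True) + eps)
    return ref_n @ tgt_n.T


def channel_drop_similarity(ref, tgt, drop_rate=0.2, rng=None):
    """Patch similarity after zeroing a random subset of channels.

    Assumption: the same channel mask is applied to both feature sets,
    so a dropped channel contributes nothing to the dot product or norms.
    `drop_rate` is a hypothetical hyperparameter name.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_channels = ref.shape[1]
    # keep[c] is True when channel c survives; dropped channels are set to zero
    keep = rng.random(num_channels) >= drop_rate
    return cosine_similarity(ref * keep, tgt * keep)


# Toy usage with random features: 4 reference patches, 6 target patches, 16 channels
features_rng = np.random.default_rng(0)
ref_feats = features_rng.normal(size=(4, 16))
tgt_feats = features_rng.normal(size=(6, 16))
sim = channel_drop_similarity(ref_feats, tgt_feats, drop_rate=0.25,
                              rng=np.random.default_rng(1))
```

In this sketch no parameters are learned and the only extra work is one elementwise mask per feature set, which is consistent with the training-free, low-cost framing above. With `drop_rate=0.0` the function reduces to plain cosine matching on raw features.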
