Matching-Based Few-Shot Semantic Segmentation Models Are Interpretable by Design
Pasquale De Marinis, Uzay Kaymak, Rogier Brussee, Gennaro Vessio, Giovanna Castellano
TL;DR
This work tackles interpretability in few-shot semantic segmentation (FSS) by introducing Affinity Explainer (AffEx), which exploits the intrinsic pixel-level matching structure of matching-based FSS models to produce attribution maps over support images. AffEx comes in three variants (Unmasked, Masked, and Signed) that derive per-layer contributions from the matching scores at multiple feature levels; these contributions are aggregated via layer-wise ablation weights and softmax normalization. The authors extend evaluation with the causal metrics IAUC and DAUC and create mIoU-based variants to quantify how useful the explanations are. AffEx outperforms standard attribution methods on COCO-$20^{i}$ and Pascal-$5^{i}$ across two representative models (DCAMA and DMTNet) in both quantitative and qualitative analyses, while maintaining reasonable computational efficiency. The paper also provides a comprehensive ablation study, a computational-cost analysis, and supplementary LIME adaptations, establishing a foundation for interpretable FSS and outlining directions for extending the approach to more FSS architectures and hybrid explainability strategies.
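The matching-and-aggregation idea summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the use of cosine similarity as the matching score, the averaging over predicted-foreground query pixels, and the `layer_scores` input (standing in for the layer-wise ablation weights) are all assumptions made for illustration.

```python
import numpy as np


def cosine_affinity(query_feats, support_feats, eps=1e-8):
    """Cosine similarity between every query and support pixel.

    query_feats: (Nq, C), support_feats: (Ns, C) -> affinity of shape (Nq, Ns).
    Assumption: cosine similarity stands in for the model's matching score.
    """
    q = query_feats / (np.linalg.norm(query_feats, axis=1, keepdims=True) + eps)
    s = support_feats / (np.linalg.norm(support_feats, axis=1, keepdims=True) + eps)
    return q @ s.T


def support_attribution(affinity, query_fg):
    """Average affinity each support pixel receives from foreground query pixels.

    Assumption: only query pixels predicted as foreground contribute.
    """
    query_fg = np.asarray(query_fg, dtype=bool)
    if not query_fg.any():
        return np.zeros(affinity.shape[1])
    return affinity[query_fg].mean(axis=0)


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()


def aggregate_layers(per_layer_maps, layer_scores):
    """Weighted sum of per-layer maps with softmax-normalized layer weights.

    layer_scores is a hypothetical per-layer importance score, for example
    the performance drop observed when that layer is ablated.
    """
    weights = softmax(layer_scores)
    stacked = np.stack(per_layer_maps, axis=0)  # (L, Ns)
    return np.tensordot(weights, stacked, axes=1), weights
```

In this sketch each layer yields one attribution vector over flattened support pixels, which assumes the per-layer maps have already been resized to a common resolution.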
Abstract
Few-Shot Semantic Segmentation (FSS) models achieve strong performance in segmenting novel classes with minimal labeled examples, yet their decision-making processes remain largely opaque. While explainable AI has advanced significantly in standard computer vision tasks, interpretability in FSS remains virtually unexplored despite its critical importance for understanding model behavior and guiding support set selection in data-scarce scenarios. This paper introduces the first dedicated method for interpreting matching-based FSS models by leveraging their inherent structural properties. Our Affinity Explainer approach extracts attribution maps that highlight which pixels in support images contribute most to query segmentation predictions, using matching scores computed between support and query features at multiple feature levels. We extend standard interpretability evaluation metrics to the FSS domain and propose additional metrics to better capture the practical utility of explanations in few-shot scenarios. Comprehensive experiments on FSS benchmark datasets, using different models, demonstrate that our Affinity Explainer significantly outperforms adapted standard attribution methods. Qualitative analysis reveals that our explanations provide structured, coherent attention patterns that align with model architectures and enable effective model diagnosis. This work establishes the foundation for interpretable FSS research, enabling better model understanding and diagnosis for more reliable few-shot segmentation systems. The source code is publicly available at https://github.com/pasqualedem/AffinityExplainer.
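To make the evaluation idea concrete, the following is a generic sketch of a deletion-style causal metric applied to a support image. It follows the usual deletion-curve recipe from the attribution literature, not necessarily the exact protocol used in this paper. The callback `score_fn` is hypothetical: it is assumed to run the FSS model with the perturbed support image and return a scalar such as the query mIoU.

```python
import numpy as np


def deletion_curve(image, attribution, score_fn, steps=10, fill=0.0):
    """Remove support pixels from most to least attributed and track the score.

    image: (H, W) or (H, W, C); attribution: (H, W).
    score_fn: hypothetical callable mapping a perturbed image to a scalar.
    Returns an array of steps + 1 scores (the first is the unperturbed score).
    """
    h, w = attribution.shape
    order = np.argsort(attribution.ravel())[::-1]  # most important first
    current = image.astype(float).copy()
    scores = [score_fn(current)]
    bounds = np.linspace(0, order.size, steps + 1).astype(int)
    for start, stop in zip(bounds[:-1], bounds[1:]):
        rows, cols = np.unravel_index(order[start:stop], (h, w))
        current[rows, cols] = fill  # blanks all channels at these pixels
        scores.append(score_fn(current))
    return np.array(scores)


def normalized_auc(scores):
    """Trapezoidal area under the curve with the x-axis rescaled to [0, 1]."""
    scores = np.asarray(scores, dtype=float)
    return float(((scores[:-1] + scores[1:]) / 2).mean())
```

Under this convention a lower deletion AUC indicates a better explanation, because removing the highest-ranked pixels first should degrade the score quickly. An insertion curve would follow the same pattern, starting from a blank image and restoring pixels in the same order, with higher AUC being better.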
