ProtoMask: Segmentation-Guided Prototype Learning
Steffen Meinert, Philipp Schlinge, Nils Strodthoff, Martin Atzmueller
TL;DR
ProtoMask addresses explainability gaps in prototype-based image classifiers by tying prototypes to object parts through segmentation-guided multi-view representations. It extends ProtoPNet with a multi-view pipeline: segmentation masks define image views, each view is embedded, and the embeddings are compared against prototypes using a distance-based similarity followed by max pooling for classification. A multi-term loss encourages accurate classification, well-separated latent spaces, diverse prototypes, and sparsity; it is applied in a two-stage training procedure that includes prototype projection. Across CUB200-2011, Stanford Dogs, and Stanford Cars, ProtoMask achieves competitive accuracy and improved explainability metrics, though its performance is sensitive to segmentation mask quality and cropping strategy.
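The scoring step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the log-ratio activation is the ProtoPNet-style similarity and is assumed here rather than confirmed for ProtoMask, and the per-view embeddings are taken as given.

```python
import numpy as np

def prototype_similarity(embeddings, prototypes, eps=1e-4):
    """Distance-based similarity between view embeddings and prototypes.

    embeddings: (V, D) array, one embedding per segmentation-derived view.
    prototypes: (P, D) array of prototype vectors.
    Returns a (V, P) array; larger values mean a closer match.
    The log-ratio form is the ProtoPNet-style activation (an assumption here).
    """
    diff = embeddings[:, None, :] - prototypes[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.log((sq_dist + 1.0) / (sq_dist + eps))

def classify(embeddings, prototypes, class_weights):
    """Max-pool similarities over views, then apply a linear class layer.

    class_weights: (C, P) array mapping prototype activations to class logits.
    Returns (logits, pooled), where pooled[p] is the best match of prototype p
    across all views.
    """
    sim = prototype_similarity(embeddings, prototypes)
    pooled = sim.max(axis=0)          # (P,) best-matching view per prototype
    logits = class_weights @ pooled   # (C,)
    return logits, pooled
```

Because the pooling runs over views, the index of the maximum for each prototype identifies the single view (and hence the segmentation mask) that activated it, which is what ties a prototype to an image region in this sketch.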
Abstract
Explainable AI (XAI) has gained considerable importance in recent years. Methods based on prototypical case-based reasoning have shown promising improvements in explainability. However, these methods typically rely on additional post-hoc saliency techniques to explain the semantics of learned prototypes, and multiple critiques have been raised about the reliability and quality of such techniques. For this reason, we study the use of prominent image segmentation foundation models to improve the truthfulness of the mapping between embedding space and input space. We aim to restrict the computation area of the saliency map to a predefined semantic image patch to reduce the uncertainty of such visualizations. To capture the information of the entire image, we use the bounding box of each generated segmentation mask to crop the image. Each mask thus yields an individual input to our novel model architecture, named ProtoMask. We conduct experiments on three popular fine-grained classification datasets with a wide set of metrics, providing a detailed overview of explainability characteristics. The comparison with other popular models demonstrates competitive performance and unique explainability features of our model. https://github.com/uos-sis/quanproto
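The cropping step mentioned in the abstract can be illustrated with a short sketch. It assumes the segmentation model's output is already available as a list of binary masks with the same height and width as the image; the helper names are hypothetical, and the resizing and normalization a real pipeline would need are omitted.

```python
import numpy as np

def mask_bounding_box(mask):
    """Return (top, bottom, left, right) of the tight box around a binary mask.

    bottom and right are exclusive. Returns None for an empty mask.
    """
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    if not rows.any():
        return None
    row_idx = np.where(rows)[0]
    col_idx = np.where(cols)[0]
    return row_idx[0], row_idx[-1] + 1, col_idx[0], col_idx[-1] + 1

def crop_views(image, masks):
    """Crop one view per non-empty mask using the mask's bounding box.

    image: (H, W, C) array; masks: iterable of (H, W) boolean arrays.
    Returns a list of (h, w, C) arrays, one per non-empty mask.
    """
    views = []
    for mask in masks:
        box = mask_bounding_box(mask)
        if box is None:
            continue  # skip masks that cover no pixels
        top, bottom, left, right = box
        views.append(image[top:bottom, left:right])
    return views
```

Each returned crop would then serve as a separate model input, so the number of views per image depends on how many masks the segmentation model produces.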
