ProtoSeg: Interpretable Semantic Segmentation with Prototypical Parts
Mikołaj Sacha, Dawid Rymarczyk, Łukasz Struski, Jacek Tabor, Bartosz Zieliński
TL;DR
ProtoSeg reframes semantic segmentation as a prototype-driven task in which each pixel prediction is grounded in prototypical patches from the training data. A novel prototype diversity loss based on Jeffreys divergence encourages prototypes of the same class to cover different semantic concepts, which improves interpretability. The approach is compatible with multiple backbones and, on Pascal VOC and Cityscapes, reaches competitive accuracy while exposing its reasoning through explicit prototype activations. The resulting explanations are patch-based and human-understandable, and they require neither additional annotation nor post hoc explanation methods.
Abstract
We introduce ProtoSeg, a novel model for interpretable semantic image segmentation, which constructs its predictions using similar patches from the training set. To achieve accuracy comparable to baseline methods, we adapt the mechanism of prototypical parts and introduce a diversity loss function that increases the variety of prototypes within each class. We show that ProtoSeg discovers semantic concepts, in contrast to standard segmentation models. Experiments conducted on Pascal VOC and Cityscapes datasets confirm the precision and transparency of the presented method.
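To make the diversity idea more concrete, the following is a minimal sketch of Jeffreys divergence (the symmetrized Kullback-Leibler divergence) between discrete distributions, together with one possible way to turn it into a penalty over prototypes of the same class. The names `kl_divergence`, `jeffreys_divergence`, and `diversity_penalty`, the smoothing constant `eps`, and the choice of averaging `exp(-J)` over pairs are assumptions made for illustration; the summary above does not specify the exact form of the loss, and some texts define Jeffreys divergence with an extra factor of 1/2.

```python
import math
from itertools import combinations


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps is an illustrative smoothing constant that avoids division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def jeffreys_divergence(p, q):
    """Symmetrized KL divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def diversity_penalty(distributions):
    """Hypothetical penalty: mean of exp(-J) over all pairs.

    Close to 1 when the distributions coincide, close to 0 when they
    differ strongly. This is an assumed form, not a confirmed formula.
    """
    pairs = list(combinations(distributions, 2))
    if not pairs:
        return 0.0
    return sum(math.exp(-jeffreys_divergence(p, q)) for p, q in pairs) / len(pairs)


# Each list stands in for how strongly one prototype activates
# over four image locations, normalized to sum to 1.
similar = [[0.7, 0.1, 0.1, 0.1], [0.7, 0.1, 0.1, 0.1]]
diverse = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]

print(diversity_penalty(similar))
print(diversity_penalty(diverse))
```

Under these assumptions, minimizing the penalty pushes prototypes of one class to activate on different regions, which matches the stated goal of covering diverse semantic concepts.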
