Beyond Patches: Mining Interpretable Part-Prototypes for Explainable AI
Mahdi Alehdaghi, Rajarshi Bhattacharya, Pourya Shamsolmoali, Rafael M. O. Cruz, Maguelonne Heritier, Eric Granger
TL;DR
PCMNet tackles the interpretability gap in deep vision models by learning adaptive part-prototypes and organizing them into semantically coherent concepts that support human-aligned explanations. It introduces a three-stage pipeline: unsupervised part-prototype discovery, class-specific clustering of those prototypes into concept centroids, and concept-activation mining, which yields a sparse, interpretable concept activation vector (CAV) used for classification. Enabled by the Marginal Cluster Center Loss and the Concept Mining Module, the approach performs strongly on explainability metrics and is robust to occlusion, while keeping accuracy and efficiency competitive with prior prototype-based XAI methods. The work contributes to AI alignment by providing transparent, controllable, and robust explanations suited to high-stakes applications.
Abstract
As AI systems grow more capable, it becomes increasingly important that their decisions remain understandable and aligned with human expectations. A key challenge is the limited interpretability of deep models. Post-hoc methods like GradCAM offer heatmaps but provide limited conceptual insight, while prototype-based approaches offer example-based explanations but often rely on rigid region selection and lack semantic consistency. To address these limitations, we propose PCMNet, a part-prototypical concept mining network that learns human-comprehensible prototypes from meaningful image regions without additional supervision. By clustering these prototypes into concept groups and extracting concept activation vectors, PCMNet provides structured, concept-level explanations and enhances robustness to occlusion and challenging conditions, which are both critical for building reliable and aligned AI systems. Experiments across multiple image classification benchmarks show that PCMNet outperforms state-of-the-art methods in interpretability, stability, and robustness. This work contributes to AI alignment by enhancing transparency, controllability, and trustworthiness in AI systems. Our code is available at: https://github.com/alehdaghi/PCMNet.
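To make the clustering and concept-activation steps concrete, the following is a minimal sketch and not the released PCMNet code. It assumes part-prototype features are already available as plain vectors, substitutes ordinary k-means for the learned clustering (the Marginal Cluster Center Loss is not modeled), and uses a simple top-k rule to obtain sparsity. The names `kmeans`, `concept_activations`, and `top_k` are hypothetical and chosen only for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means, used here as a stand-in for concept clustering."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            dists = [sum((x - c) ** 2 for x, c in zip(p, cen)) for cen in centroids]
            groups[dists.index(min(dists))].append(p)
        for j, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def concept_activations(part_feats, centroids, top_k):
    """For each centroid, take the best cosine match over all parts,
    then keep the top_k largest activations and zero the rest."""
    acts = [max(cosine(p, c) for p in part_feats) for c in centroids]
    keep = sorted(range(len(acts)), key=lambda i: acts[i], reverse=True)[:top_k]
    return [acts[i] if i in keep else 0.0 for i in range(len(acts))]


# Synthetic stand-ins for features (assumption: 8-dimensional vectors).
rng = random.Random(0)
class_parts = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(60)]
centroids = kmeans(class_parts, k=5)        # concept centroids for one class
image_parts = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(4)]
cav = concept_activations(image_parts, centroids, top_k=2)
```

In this sketch, `cav` has one entry per concept centroid and at most `top_k` nonzero entries, so a linear classifier on top of it would expose which concepts drove a prediction. How PCMNet itself enforces sparsity and trains the centroids is described in the paper and repository, not here.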
