From Segments to Concepts: Interpretable Image Classification via Concept-Guided Segmentation
Ran Eisenberg, Amit Rozner, Ethan Fetaya, Ofir Lindenbaum
TL;DR
SEG-MIL-CBM addresses interpretability in vision by grounding predictions in semantically meaningful image regions. It combines a CLIP-guided concept segmentation pipeline with attention-based multiple instance learning (MIL), treating each segment as an instance and aligning segment concepts with CLIP cues, which yields spatially grounded, concept-level explanations without concept annotations. Empirically, it improves worst-group accuracy under spurious correlations, maintains strong performance on standard benchmarks, and is robust to common corruptions, while offering interpretable region-level reasoning. The approach bridges interpretability and robustness for open-world vision systems and may apply to safety-critical tasks where regional evidence and concept alignment are crucial.
Abstract
Deep neural networks have achieved remarkable success in computer vision; however, their black-box decision-making limits interpretability and trust, particularly in safety-critical applications where errors have severe consequences. Existing models not only lack transparency but also risk exploiting unreliable or misleading features, which undermines both robustness and the validity of their explanations. Concept Bottleneck Models (CBMs) aim to improve transparency by reasoning through human-interpretable concepts, but they require costly concept annotations and lack spatial grounding, often failing to identify which regions support each concept. We propose SEG-MIL-CBM, a novel framework that integrates concept-guided image segmentation with attention-based multiple instance learning (MIL): each segmented region is treated as an instance, and the model learns to aggregate evidence across instances. By reasoning over semantically meaningful regions aligned with high-level concepts, our model highlights task-relevant evidence, down-weights irrelevant cues, and produces spatially grounded, concept-level explanations without requiring concept or group annotations. SEG-MIL-CBM achieves robust performance across settings involving spurious correlations (unintended dependencies between background and label), input corruptions (perturbations that degrade visual quality), and large-scale benchmarks, while providing transparent, concept-level explanations.
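To make the idea of treating segments as instances concrete, the following is a minimal sketch of attention-based MIL pooling over per-segment embeddings. It is an illustration only, not the authors' implementation: the tanh-based scoring form, the function name `attention_mil_pool`, the parameters `V` and `w`, and the random embeddings are all assumptions introduced here, since the abstract does not specify the attention architecture.

```python
import numpy as np


def attention_mil_pool(instances, V, w):
    """Aggregate segment embeddings into one bag embedding (hypothetical sketch).

    instances: (K, D) array, one embedding per segmented region (assumed given).
    V:         (H, D) projection matrix (assumed learnable parameter).
    w:         (H,)   scoring vector (assumed learnable parameter).
    Returns the (D,) bag embedding and the (K,) attention weights.
    """
    # Unnormalised attention score for each segment.
    scores = np.tanh(instances @ V.T) @ w          # shape (K,)
    # Numerically stable softmax so the weights are non-negative and sum to 1.
    scores = scores - scores.max()
    exp_scores = np.exp(scores)
    weights = exp_scores / exp_scores.sum()        # shape (K,)
    # Weighted sum of segment embeddings: high-weight segments dominate.
    bag = weights @ instances                      # shape (D,)
    return bag, weights


# Toy usage with random values standing in for real segment features.
rng = np.random.default_rng(0)
segments = rng.normal(size=(5, 8))   # 5 segments, 8-dimensional embeddings
V = rng.normal(size=(4, 8))
w = rng.normal(size=4)
bag_embedding, attn = attention_mil_pool(segments, V, w)
```

In a setup like this, the attention weights give one score per region, which is what allows a prediction to be traced back to the segments that contributed most to it.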
