Explainable Visual Anomaly Detection via Concept Bottleneck Models

Arianna Stropeni; Valentina Zaccaria; Francesco Borsatti; Davide Dalle Pezze; Manuel Barusco; Gian Antonio Susto

TL;DR

This paper addresses the interpretability gap in Visual Anomaly Detection (VAD) by adapting Concept Bottleneck Models (CBMs) so that anomaly predictions are made through human-understandable concepts. It introduces CONVAD, which couples a concept extractor $g: \mathcal{X} \rightarrow \mathcal{C}$ with a predictor $f: \mathcal{C} \rightarrow \mathcal{Y}$, so that each prediction $\hat{y}=f(g(\mathbf{x}))$ can be explained through the intermediate concepts $\hat{c}=g(\mathbf{x})$, and it supports test-time interventions on $\hat{c}$ to improve accuracy. The approach adds three components: a Concept Dataset Pipeline that automatically annotates industrial images, a Visual Explanation module that localizes anomalies via student-teacher distillation, and a Synthetic Anomaly Generation (SAG) pipeline that preserves the unsupervised nature of VAD. On the MVTec dataset, CONVAD achieves competitive image- and pixel-level performance while providing richer, concept-driven explanations; interventions on concepts further boost performance, highlighting the value of human-in-the-loop control in VAD. The authors point to improved novelty detection and more refined synthetic anomaly generation as future work to narrow the gap between synthetic and real anomaly distributions.
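The composition $\hat{y}=f(g(\mathbf{x}))$ and the idea of a test-time intervention on $\hat{c}$ can be illustrated with a minimal sketch. This is not the CONVAD implementation: the concept names, the weights, and the helper functions `g`, `f`, and `predict` below are hypothetical toy choices, and both stages are reduced to a linear map followed by a sigmoid purely for illustration.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical concept names and toy weights, chosen only for illustration.
CONCEPTS = ("scratch", "crack", "contamination")
W_G = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
B_G = [-1.0, -1.0, -1.0]
W_F = [3.0, 3.0, 3.0]
B_F = -4.0


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def g(x: Sequence[float]) -> List[float]:
    """Toy concept extractor g: X -> C, one score in (0, 1) per concept."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W_G, B_G)
    ]


def f(c: Sequence[float]) -> float:
    """Toy predictor f: C -> Y, an anomaly score computed from concepts only."""
    return sigmoid(sum(w * ci for w, ci in zip(W_F, c)) + B_F)


def predict(
    x: Sequence[float],
    interventions: Optional[Dict[str, float]] = None,
) -> Tuple[float, Dict[str, float]]:
    """Compute y_hat = f(g(x)), optionally overriding some predicted concepts."""
    c_hat = g(x)
    for name, value in (interventions or {}).items():
        # Test-time intervention: a human replaces a predicted concept value.
        c_hat[CONCEPTS.index(name)] = value
    return f(c_hat), dict(zip(CONCEPTS, c_hat))


x = [0.0, 0.0, 0.0]
score, concepts = predict(x)
score_fixed, concepts_fixed = predict(x, {"crack": 1.0})
print(f"before intervention: {score:.3f}, after: {score_fixed:.3f}")
```

Because `f` only sees the concept vector, correcting a single concept is enough to change the final score, which is the mechanism behind the human-in-the-loop interventions mentioned above.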

Abstract

In recent years, Visual Anomaly Detection (VAD) has gained significant attention due to its ability to identify anomalous images using only normal images during training. Many VAD models work without supervision but are still able to provide visual explanations by highlighting the anomalous regions within an image. However, although these visual explanations can be helpful, they lack a direct and semantically meaningful interpretation for users. To address this limitation, we propose extending Concept Bottleneck Models (CBMs) to the VAD setting. By learning meaningful concepts, the network can provide human-interpretable descriptions of anomalies, offering a novel and more insightful way to explain them. Our contributions are threefold: (i) we develop a Concept Dataset to support research on CBMs for VAD; (ii) we improve the CBM architecture to generate both concept-based and visual explanations, bridging semantic and localization interpretability; and (iii) we introduce a pipeline for synthesizing artificial anomalies, preserving the VAD paradigm of minimizing dependence on rare anomalous samples. Our approach, Concept-Aware Visual Anomaly Detection (CONVAD), achieves performance comparable to classic VAD methods while providing richer, concept-driven explanations that enhance interpretability and trust in VAD systems.
