Concept-Based Explanations in Computer Vision: Where Are We and Where Could We Go?

Jae Hee Lee, Georgii Mikriukov, Gesina Schwalbe, Stefan Wermter, Diedrich Wolter

TL;DR

This paper reviews C-XAI methods to identify interesting and underexplored areas and proposes future research directions. It considers three main directions: the choice of concepts to explain, the choice of concept representation, and how to control concepts.

Abstract

Concept-based XAI (C-XAI) approaches to explaining neural vision models are a promising field of research, since explanations that refer to concepts (i.e., semantically meaningful parts in an image) are intuitive to understand and go beyond saliency-based techniques that only reveal relevant regions. Given the remarkable progress in this field in recent years, it is time for the community to take a critical look at the advances and trends. Consequently, this paper reviews C-XAI methods to identify interesting and underexplored areas and proposes future research directions. To this end, we consider three main directions: the choice of concepts to explain, the choice of concept representation, and how we can control concepts. For the last of these, we propose techniques and draw inspiration from the field of knowledge representation and learning, showing how this could enrich future C-XAI research.

Paper Structure (27 sections, 5 figures)

This paper contains 27 sections, 5 figures.

Figures (5)

  • Figure 1: Overview of envisaged methodology for model understanding and control. Using rich and relational concept annotations (e.g., grounded in an ontology) of visual model inputs, intuitive concepts and relations are associated with global, expressive, and semantically faithful concept representations in the model's latent space (e.g., distributions). This allows interactive knowledge verification and local or global control, e.g., adjusting the concept representation to globally separate the concept boiler from the concept chimney.
  • Figure 2: Illustration of Net2Vec [fong2018net2vec] for associating a concept with a linear separator with weight vector $w_c$ in (activation pixel) latent space (left), and illustration of typical concept representation variants (center: direction-based, right: cluster-based).
  • Figure 3: Illustration of the ontological commitment (right) and of a complex concept distribution (left) in the latent spaces of actual vision models.
  • Figure 4: Detailed taxonomy of state-of-the-art C-XAI methods.
  • Figure 5: Some creation steps of the concept distribution plot (Figure 3, left), from left to right: (1) given the local concept embedding vectors, apply UMAP for dimensionality reduction; (2) fit Gaussian mixture models and determine boundaries of standard deviations; (3) add background shading to indicate the most probable concepts (here shown for all 3 graphics), and overlay everything.
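
The linear concept separator mentioned in the Figure 2 caption can be illustrated with a small sketch. It is not the Net2Vec implementation: it assumes that activation pixels are given as rows of a matrix, that a binary label marks whether each pixel lies inside the concept region, and that plain gradient descent on a logistic loss is an acceptable stand-in for the original training procedure. The function names, the synthetic data, and all hyperparameters are hypothetical.

```python
import numpy as np


def fit_concept_vector(acts, labels, lr=0.1, steps=500):
    """Fit a linear separator (w_c, b_c) on activation pixels.

    acts:   (N, C) array, one row per activation pixel with C channels.
    labels: (N,) array of 0/1, 1 where the pixel belongs to the concept.
    Training is plain gradient descent on the logistic loss (an assumption).
    """
    n, c = acts.shape
    w = np.zeros(c)
    b = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(acts @ w + b)))
        grad = probs - labels              # d(loss)/d(logit) per pixel
        w -= lr * (acts.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def concept_score(acts, w, b):
    """Signed score of each activation pixel relative to the separator."""
    return acts @ w + b


# Synthetic stand-in for latent activations: the concept shifts channel 0.
rng = np.random.default_rng(0)
C = 8
direction = np.zeros(C)
direction[0] = 1.0
pos = rng.normal(size=(200, C)) + 3.0 * direction   # concept pixels
neg = rng.normal(size=(200, C)) - 3.0 * direction   # non-concept pixels
acts = np.vstack([pos, neg])
labels = np.concatenate([np.ones(200), np.zeros(200)])

w_c, b_c = fit_concept_vector(acts, labels)
accuracy = ((concept_score(acts, w_c, b_c) > 0) == (labels == 1)).mean()
cosine = w_c @ direction / np.linalg.norm(w_c)
```

Here `w_c` plays the role of the direction-based concept representation: the fitted weight vector should point roughly along the synthetic concept direction, and the sign of `concept_score` separates concept pixels from the rest. A cluster-based or distribution-based representation, as in Figures 3 and 5, would replace the single vector with a fitted density over the latent space.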