When a Relation Tells More Than a Concept: Exploring and Evaluating Classifier Decisions with CoReX

Bettina Finzel; Patrick Hilme; Johannes Rabold; Ute Schmid

TL;DR

The paper addresses the limits of pixel-level explanations for complex CNN decisions by introducing CoReX, a concept- and relation-based explainer. CoReX couples class-conditioned concept relevance (via CRP) with inductive logic programming to learn interpretable relational rules, enabling contrastive explanations and interactive constraint-based evaluation. Across several image data sets and CNN architectures, CoReX explanations are faithful to the original models in terms of predictive outcomes, and a human evaluation indicates that explanations combining concepts and relations help in assessing classification quality. The approach provides human-actionable, rule-based insights into model decisions and offers pathways for model refinement guided by domain knowledge.

Abstract

Explanations for Convolutional Neural Networks (CNNs) based on the relevance of input pixels may be too unspecific to evaluate which input features impact model decisions and how. Especially in complex real-world domains like biology, the presence of specific concepts and of relations between concepts may discriminate between classes. Pixel relevance is not expressive enough to convey this type of information. As a consequence, model evaluation is limited, and relevant aspects that are present in the data and influence the model decisions may be overlooked. This work presents a novel method to explain and evaluate CNN models, which uses a concept- and relation-based explainer (CoReX). It explains the predictive behavior of a model on a set of images by masking (ir-)relevant concepts from the decision-making process and by constraining relations in a learned interpretable surrogate model. We test our approach with several image data sets and CNN architectures. Results show that CoReX explanations are faithful to the CNN model in terms of predictive outcomes. We further demonstrate through a human evaluation that CoReX is a suitable tool for generating combined explanations that help assess the classification quality of CNNs. Finally, we show that CoReX supports the identification and re-classification of incorrect or ambiguous classifications.
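To make the idea of a relational surrogate and its faithfulness more concrete, the following is a minimal toy sketch in Python, not the authors' implementation. It assumes that each image has already been reduced to a list of named concepts with a horizontal position, that the surrogate is a single hand-written rule modeled on the "spout right of handle" example, and that faithfulness is measured as the share of images on which surrogate and CNN agree. The names `Concept`, `right_of`, `surrogate_is_teapot`, and `fidelity`, as well as all data values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A detected concept, reduced to a name and a horizontal position (assumption)."""
    name: str
    x: float


def right_of(a: Concept, b: Concept) -> bool:
    """Toy spatial relation: a lies to the right of b."""
    return a.x > b.x


def surrogate_is_teapot(concepts: list[Concept]) -> bool:
    """Hand-written stand-in for a learned rule:
    teapot(X) if X has a handle, X has a spout, and the spout is right of the handle.
    """
    handles = [c for c in concepts if c.name == "handle"]
    spouts = [c for c in concepts if c.name == "spout"]
    return any(right_of(s, h) for s in spouts for h in handles)


def fidelity(cnn_labels: list[bool], surrogate_labels: list[bool]) -> float:
    """Share of samples on which surrogate and CNN predictions agree."""
    if not cnn_labels or len(cnn_labels) != len(surrogate_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    agreements = sum(c == s for c, s in zip(cnn_labels, surrogate_labels))
    return agreements / len(cnn_labels)


# Made-up example data: three images described by their concepts.
images = [
    [Concept("handle", 10.0), Concept("spout", 80.0)],  # spout right of handle
    [Concept("handle", 50.0)],                          # no spout at all
    [Concept("spout", 20.0), Concept("handle", 70.0)],  # spout left of handle
]
cnn_predictions = [True, False, True]  # made-up CNN outputs for "teapot"

surrogate_predictions = [surrogate_is_teapot(img) for img in images]
score = fidelity(cnn_predictions, surrogate_predictions)
print(surrogate_predictions, round(score, 3))
```

In this toy setting the third image is the interesting one: the CNN output and the rule disagree, which is the kind of case a relation-based explanation would flag for closer inspection.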

Paper Structure (50 sections, 5 equations, 30 figures, 10 tables, 2 algorithms)

Introduction
Related Work
Concepts, Relations, and Disentangled Representations
Explanations for Interactive Learning
Background and Terminology
Extracting Visual Features with CRP
Learning Symbolic Hypotheses with ILP
Method
Building Background Knowledge from Concepts
Integration of Spatial Relational Knowledge
Model Truth and Explainer Truth
Algorithm
Experiments
Models
Data Sets
...and 35 more sections

Figures (30)

Figure 1: Explaining why a sample image belongs to the target class "teapot" by contrasting it with a sample from the contrastive class "vase" based on identified concepts (handle, spout) and their relations ("spout right of handle").
Figure 2: Overview of our CoReX approach for explaining and evaluating CNN image classifications with concept- and relation-based explanations and constraints (concept masking and relational constraints).
Figure 3: Overview of the process of generating explanations with ILP from extracted concepts and learned relations.
Figure 4: Examples of contrastive classes.
Figure 5: The generated CoReX explanations for Aleph and Popper, containing the concepts "handle", "spout" and "lid". Additionally, samples that are covered by the explanations are shown, with the concept polygon borders and relevance values highlighted in heatmaps.
...and 25 more figures
