Formal Abductive Latent Explanations for Prototype-Based Networks

Jules Soria; Zakaria Chihani; Julien Girard-Satabin; Alban Grastien; Romain Xu-Darme; Daniela Cancila

TL;DR

Prototype-based networks often rely on intuitive prototype explanations that can be misleading. This work introduces Abductive Latent Explanations (ALEs), a formal explainable AI (FXAI) framework that defines latent-space preconditions $\phi_{\mathcal{E}}$ over $f(\mathbf{x})$ that guarantee the prediction. It provides solver-free, geometry-based algorithms (domain/spatial constraints, the triangle inequality, and hypersphere intersection) to compute verifiable ALEs in the latent space. Across multiple datasets and backbones, the results show that achieving formal guarantees typically requires large explanations involving many prototypes. Larger ALEs also correlate with incorrect predictions and may therefore signal uncertainty. These findings improve explanation reliability for safety-critical use and highlight a fundamental trade-off between interpretability and fidelity in prototype-based models.

Abstract

Case-based reasoning networks are machine-learning models that make predictions based on the similarity between the input and prototypical parts of training samples, called prototypes. Such models can explain each decision by pointing to the prototypes that contributed the most to the final outcome. Because the explanation is a core part of the prediction, they are often described as "interpretable by design". While this is promising, we show that such explanations are sometimes misleading, which hampers their usefulness in safety-critical contexts. In particular, several instances may lead to different predictions and yet have the same explanation. Drawing inspiration from the field of formal eXplainable AI (FXAI), we propose Abductive Latent Explanations (ALEs), a formalism to express sufficient conditions on the intermediate (latent) representation of the instance that imply the prediction. Our approach combines the inherent interpretability of case-based reasoning models with the guarantees provided by formal XAI. We propose a solver-free and scalable algorithm for generating ALEs based on three distinct paradigms, compare them, and demonstrate the feasibility of our approach on diverse datasets for both standard and fine-grained image classification. The associated code can be found at https://github.com/julsoria/ale
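
To make the idea of a sufficient latent-space condition more concrete, the following is a minimal sketch of how a triangle-inequality argument could certify a prediction from a subset of prototype distances. It is not the authors' implementation (see the repository above for that). The function names `triangle_bounds` and `is_sufficient`, the similarity $1/(1+d)$, and the linear scoring layer `weights` are all illustrative assumptions. Given exact distances from the latent vector to a few "explained" prototypes, the triangle inequality bounds the distance to every other prototype; if the predicted class wins under the worst case of those bounds, the explained subset is sufficient.

```python
import numpy as np


def triangle_bounds(known_dists, proto_dists, known_idx):
    """Bound d(z, p_j) for all prototypes j from the known d(z, p_i).

    known_dists: distances d(z, p_i) for the prototypes in known_idx.
    proto_dists: (n, n) matrix of pairwise prototype distances d(p_i, p_j).
    Returns (lower, upper) arrays of length n.
    """
    n = proto_dists.shape[0]
    lower = np.zeros(n)
    upper = np.full(n, np.inf)
    for i, d_zi in zip(known_idx, known_dists):
        d_ij = proto_dists[i]
        # |d(z,p_i) - d(p_i,p_j)| <= d(z,p_j) <= d(z,p_i) + d(p_i,p_j)
        lower = np.maximum(lower, np.abs(d_zi - d_ij))
        upper = np.minimum(upper, d_zi + d_ij)
    return lower, upper


def is_sufficient(lower, upper, weights, pred):
    """Check whether class `pred` wins for every distance vector in the bounds.

    Assumption: similarity is 1 / (1 + d) (decreasing in d) and class scores
    are weights @ similarities. Each similarity is bounded independently, so
    the check is sound but conservative.
    """
    s_hi = 1.0 / (1.0 + lower)  # closest possible -> highest similarity
    s_lo = 1.0 / (1.0 + upper)  # farthest possible -> lowest similarity
    for k in range(weights.shape[0]):
        if k == pred:
            continue
        diff = weights[pred] - weights[k]
        # Worst case for the margin score[pred] - score[k].
        worst = np.sum(np.where(diff >= 0, diff * s_lo, diff * s_hi))
        if worst <= 0:
            return False
    return True


# Hypothetical toy setting: two prototypes, one per class.
prototypes = np.array([[0.0, 0.0], [10.0, 0.0]])
proto_dists = np.linalg.norm(prototypes[:, None] - prototypes[None, :], axis=-1)
weights = np.eye(2)
z = np.array([0.5, 0.0])

# Explain with prototype 0 only.
known_idx = [0]
known_dists = [np.linalg.norm(z - prototypes[0])]
lower, upper = triangle_bounds(known_dists, proto_dists, known_idx)
print(is_sufficient(lower, upper, weights, pred=0))
```

In this toy setting, knowing only the distance to prototype 0 already pins the distance to prototype 1 into a narrow interval, so a single prototype suffices. When prototypes lie close to one another the intervals overlap more, and more prototypes must be added before the check succeeds, which is consistent with the observation above that formal guarantees tend to require large explanations.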
