Tangentially Aligned Integrated Gradients for User-Friendly Explanations

Lachlan Simpson, Federico Costanza, Kyle Millar, Adriel Cheng, Cheng-Chew Lim, Hong Gunn Chew

TL;DR

This work addresses the base-point sensitivity of Integrated Gradients by introducing tangential alignment, a criterion grounded in the manifold hypothesis that seeks explanations lying in the tangent space $T_{x}M$. It derives theoretical conditions via $H_{x}$ and $E_{x}$, provides a practical optimization to find the optimal base-point $\alpha_x^*$, and demonstrates Tangential IG on four image datasets using a tangent-space estimate from a convolutional autoencoder. Compared to standard base-points and three gradient explainability models, Tangential IG yields higher tangential-content $\mu_x$ and more perceptually meaningful attributions. The approach is general to any base-point attribution method and advances user-friendly, robust explanations for neural networks in vision tasks.

Abstract

Integrated gradients is prevalent within machine learning to address the black-box problem of neural networks. The explanations given by integrated gradients depend on a choice of base-point. The choice of base-point is not a priori obvious and can lead to drastically different explanations. There is a longstanding hypothesis that data lies on a low dimensional Riemannian manifold. The quality of explanations on a manifold can be measured by the extent to which an explanation for a point lies in its tangent space. In this work, we propose that the base-point should be chosen such that it maximises the tangential alignment of the explanation. We formalise the notion of tangential alignment and provide theoretical conditions under which a base-point choice will provide explanations lying in the tangent space. We demonstrate how to approximate the optimal base-point on several well-known image classification datasets. Furthermore, we compare the optimal base-point choice with common base-points and three gradient explainability models.
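The base-point dependence described above can be seen in a small numerical sketch. The toy function `model`, its hand-written gradient `model_grad`, and the helper `integrated_gradients` are illustrative assumptions, not the networks or code used in the paper. The integral along the straight path from the base-point to the input is approximated with a midpoint Riemann sum.

```python
import math

def model(x):
    # Hypothetical toy "network": a smooth scalar function of two features.
    return math.tanh(x[0] + 2.0 * x[1]) + x[0] * x[1]

def model_grad(x):
    # Analytic gradient of the toy model above.
    s = 1.0 - math.tanh(x[0] + 2.0 * x[1]) ** 2
    return [s + x[1], 2.0 * s + x[0]]

def integrated_gradients(x, base, steps=200):
    # IG_i = (x_i - base_i) * average of dF/dx_i along the straight path
    # from base to x, approximated with a midpoint Riemann sum.
    d = len(x)
    avg_grad = [0.0] * d
    for k in range(steps):
        t = (k + 0.5) / steps
        point = [b + t * (xi - b) for xi, b in zip(x, base)]
        g = model_grad(point)
        for i in range(d):
            avg_grad[i] += g[i] / steps
    return [(xi - b) * gi for xi, b, gi in zip(x, base, avg_grad)]

x = [1.0, 0.5]
ig_zero = integrated_gradients(x, [0.0, 0.0])    # all-zeros base-point
ig_other = integrated_gradients(x, [1.0, -0.5])  # a different base-point

print("zero base-point: ", ig_zero)
print("other base-point:", ig_other)
```

With the second base-point, the first feature receives zero attribution because the input and the base-point agree in that coordinate, while the all-zeros base-point assigns it a clearly non-zero value. In both cases the attributions sum (up to discretisation error) to the difference between the model output at the input and at the base-point.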

Paper Structure

This paper contains 12 sections, 4 theorems, 26 equations, 2 figures.

Key Result

Theorem 1

Let $A : M \times M \times \mathcal{F}(M) \to \mathbb{R}^{d}$ be a BAM and $x, \alpha \in M$, $F \in \mathcal{F}(M)$. Then $A$ is tangentially aligned at $x$, with base-point $\alpha$, if and only if $H_{x}(\alpha) = 0$ or, equivalently, if $E_{x}(\alpha) = 0$.

Figures (2)

  • Figure 1: Kernel density estimate plot of the fraction of the explanation in the tangent space with (a) different base-point choices and (b) different gradient explainability models. The fraction of the explanation in the tangent space is measured with $\mu_{x}$ (Equation \ref{def:mu}). The vertical line represents the expected fraction of a random vector that lies in the tangent space, $\approx \sqrt{n/d}$, where $n = \dim(T_{x}M)$ and $d$ is the dimension of the ambient space $\mathbb{R}^{d}$. On CIFAR10 and FER2013, $n = 144$. On MNIST32 and Fashion-MNIST, $n = 10$.
  • Figure 2: Attributions of IG with differing base-point choice on example points from MNIST, FER2013, and Fashion-MNIST. The fraction of the explanation in the tangent space is denoted by $\mu_{x}$ (Equation \ref{def:mu}).
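The $\sqrt{n/d}$ reference line in Figure 1 can be checked numerically. The sketch below assumes that the fraction of a vector in the tangent space is the norm of its orthogonal projection onto that space divided by the norm of the vector. This is an illustrative reading of $\mu_{x}$, whose definition is not reproduced here. The helper `tangential_fraction`, the choice $d = 1024$ (a $32 \times 32$ image), and the coordinate-aligned tangent basis are likewise assumptions. The paper estimates the tangent space with a convolutional autoencoder, which this sketch does not attempt.

```python
import math
import random

def tangential_fraction(a, basis):
    # Norm of the orthogonal projection of `a` onto span(basis),
    # divided by the norm of `a`. `basis` must be orthonormal.
    coeffs = [sum(ai * bi for ai, bi in zip(a, b)) for b in basis]
    proj_norm = math.sqrt(sum(c * c for c in coeffs))
    return proj_norm / math.sqrt(sum(ai * ai for ai in a))

random.seed(0)
d, n, trials = 1024, 10, 200  # assumed ambient and tangent dimensions

# Hypothetical tangent space: the span of the first n coordinate axes.
# By rotational symmetry of Gaussian vectors, any n-dimensional subspace
# gives the same distribution of fractions.
basis = [[1.0 if j == i else 0.0 for j in range(d)] for i in range(n)]

fractions = []
for _ in range(trials):
    v = [random.gauss(0.0, 1.0) for _ in range(d)]
    fractions.append(tangential_fraction(v, basis))

mean_fraction = sum(fractions) / trials
expected = math.sqrt(n / d)
print(f"empirical mean: {mean_fraction:.4f}, sqrt(n/d): {expected:.4f}")
```

Under these assumptions the empirical mean should land close to $\sqrt{n/d}$. An attribution whose fraction sits well above this value is more aligned with the tangent space than a random direction would be.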

Theorems & Definitions (7)

  • Definition 1
  • Theorem 1
  • proof
  • Corollary 2
  • proof
  • Lemma 3
  • Theorem 4