Using the Path of Least Resistance to Explain Deep Networks

Sina Salek, Joseph Enguehard

TL;DR

This work identifies a fundamental flaw in traditional Integrated Gradients (IG) that arises from integrating along straight paths in Euclidean space. It proposes Geodesic Integrated Gradients (GIG), which computes attributions by integrating gradients along geodesics on a model-defined Riemannian manifold with metric $G_x = J_x^T J_x$, where $J_x$ is the model's Jacobian at $x$. Two practical strategies approximate the geodesics: $k$NN graph-based shortest paths for small models and energy-based sampling via Stochastic Variational Inference (SVI) for larger ones. The work also introduces a new axiom, Strong Completeness, and shows that GIG is the only path-based method that satisfies it. Empirically, GIG outperforms IG and other explainers on synthetic half-moons data and real-world Pascal VOC 2012 images; the $k$NN variant delivers robust performance, while SVI offers deeper exploration at the cost of tuning and computation. By accounting for the gradient landscape and curvature of the input space, the approach yields more faithful attributions and more reliable interpretations in safety- and bias-sensitive applications, albeit with substantial computational demands and hyperparameter considerations.

Abstract

Integrated Gradients (IG), a widely used axiomatic path-based attribution method, assigns importance scores to input features by integrating model gradients along a straight path from a baseline to the input. While effective in some cases, we show that straight paths can lead to flawed attributions. In this paper, we identify the cause of these misattributions and propose an alternative approach that treats the input space as a Riemannian manifold, computing attributions by integrating gradients along geodesics. We call this method Geodesic Integrated Gradients (GIG). To approximate geodesic paths, we introduce two techniques: a k-Nearest Neighbours-based approach for smaller models and a Stochastic Variational Inference-based method for larger ones. Additionally, we propose a new axiom, Strong Completeness, extending the axioms satisfied by IG. We show that this property is desirable for attribution methods and that GIG is the only method that satisfies it. Through experiments on both synthetic and real-world data, we demonstrate that GIG outperforms existing explainability methods, including IG.
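To make the straight-path integration concrete, here is a minimal sketch of standard IG, assuming a hypothetical toy model `f` with a hand-written gradient `grad_f` (neither comes from the paper). The path integral is approximated with a midpoint Riemann sum, and the final lines check the usual Completeness property: the attributions should sum to roughly $f(\mathbf{x}) - f(\overline{\mathbf{x}})$.

```python
import numpy as np

def f(x):
    # Hypothetical toy model (an assumption for illustration only).
    return np.tanh(x[0] * x[1]) + 0.5 * x[0] ** 2

def grad_f(x):
    # Analytic gradient of the toy model above.
    s = 1.0 - np.tanh(x[0] * x[1]) ** 2
    return np.array([s * x[1] + x[0], s * x[0]])

def integrated_gradients(x, baseline, grad, steps=200):
    # Midpoint Riemann sum over the straight path baseline -> x.
    alphas = (np.arange(steps) + 0.5) / steps
    points = baseline + alphas[:, None] * (x - baseline)
    grads = np.stack([grad(p) for p in points])
    # Attribution_i = (x_i - baseline_i) * average of df/dx_i along the path.
    return (x - baseline) * grads.mean(axis=0)

x = np.array([1.0, 2.0])
baseline = np.zeros(2)
attr = integrated_gradients(x, baseline, grad_f)

# Completeness: these two numbers should nearly agree.
print(attr.sum(), f(x) - f(baseline))
```

In practice the gradient would come from automatic differentiation rather than a hand-written function; the analytic form is used here only to keep the sketch dependency-free.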

Paper Structure

This paper contains 17 sections, 1 theorem, 25 equations, 10 figures, 2 tables.

Key Result

Theorem 1

Let $f:\mathbb{R}^n\to\mathbb{R}$ be continuously differentiable, and let $\overline{\mathbf{x}},\, \mathbf{x}\in\mathbb{R}^n$. Given a smooth path define its attributions as and assume the Riemannian metric is given by Eq. \ref{eq:inner_product}. The length of a path is given by Eq. \ref{eq:length}. Suppose that the geodesic path connecting $\overline{\mathbf{x}}$ and $\mathbf{x}$ exists. Then, if and onl

Figures (10)

  • Figure 1: Comparison of attributions generated by Integrated Gradients (middle figure) and Geodesic Integrated Gradients (right figure) for image classification with a ConvNext model. Integrated Gradients follows straight paths in Euclidean space, which can result in misleading attributions. In contrast, Geodesic Integrated Gradients integrates along geodesic paths on a Riemannian manifold defined by the model, correcting misattributions caused by poor alignment with the model's gradient landscape. In both cases, the baseline is a black image. Because the jets are black, IG is misled into considering them unimportant for classification (apart from artefacts created outside the boundaries of the jets), despite the fact that they are the objects being classified. Geodesic IG does not suffer from this issue. Further examples of such misattributions due to black segments in images are shown in Appendix \ref{app:voc}.
  • Figure 2: Metrics Comparison. We use a ConvNext model to classify images from the VOC dataset. The horizontal axis represents the top k% (in absolute value) of selected features. The left plot, Comprehensiveness, shows the average change in the predicted class probability compared to the original image (higher is better). The right plot displays the log-odds (lower is better). In both cases, results are summarised using AUC, where higher values indicate better performance. Geodesic IG significantly outperforms other methods in both metrics. See Section \ref{sec:experiments} for details of the experiments.
  • Figure 3: Integrated Gradients (IG) attributions (top) vs. Geodesic IG (bottom). We show scatter plots of 10,000 samples from the half-moons dataset with noise parameter $\mathcal{N}(0, 0.15)$. An MLP model is trained for classification, and the model gradients are shown as contour maps. The model is nearly flat everywhere except at the decision boundary. Using a baseline at $(-0.5, -0.5)$, we compute IG and Geodesic IG attributions. From left to right, the colour maps display (a) feature attributions along the horizontal axis, (b) feature attributions along the vertical axis, (c) the absolute sum of attributions, $\sum_i |A_i(\mathbf{x})|$, and (d) the total sum of attributions, $\sum_i A_i(\mathbf{x})$. According to Axioms \ref{ax:complete} and \ref{ax:strong}, the heatmaps in the last two columns should resemble those in Fig. \ref{fig:outcome_diff}. As shown, IG satisfies Axiom \ref{ax:complete} (last column) but not Axiom \ref{ax:strong} (penultimate column). In contrast, Geodesic IG satisfies both. Additionally, similar to Fig. \ref{fig:duck}, IG is highly sensitive to the choice of baseline due to its reliance on a straight-line path, whereas Geodesic IG mitigates this sensitivity.
  • Figure 4: Model output at the baseline vs. input points. To assess whether our attribution methods satisfy Axioms \ref{ax:complete} and \ref{ax:strong} in the half-moons example, we plot the model output at the input points, subtracting the model output at the baseline. The left plot shows this difference, $f(\textbf{x}) - f(\overline{\textbf{x}})$, while the right plot shows the absolute difference, $|f(\textbf{x}) - f(\overline{\textbf{x}})|$. Comparing these plots with those in Fig. \ref{fig:ig}, we observe that Geodesic IG satisfies both axioms, whereas IG satisfies only Completeness.
  • Figure 5: Method overview. For an input $\textbf{x}$, a baseline $\overline{\textbf{x}}$, and a set of points $\textbf{x}_i$, we compute the $k$NN graph using the Euclidean distance (dashed lines). For each pair $(\textbf{x}_i, \textbf{x}_j)$, we then compute the integrated gradients $\textrm{L}^*_{ij}$ using Equation \ref{eq:log_geo_approx}. For clarity, not all $\textrm{L}^*_{ij}$ are shown in the figure. Nodes 0 and 7 represent $\overline{\textbf{x}}$ and $\textbf{x}$ respectively. Using the resulting undirected weighted graph, we use the Dijkstra algorithm to find the shortest path between $\textbf{x}$ and $\overline{\textbf{x}}$ (blue continuous lines). On the left, the points $\textbf{x}_i$ are provided while, on the right, the points are generated along the straight line between $\textbf{x}$ and $\overline{\textbf{x}}$ (dotted line).
  • ...and 5 more figures
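
The $k$NN-and-Dijkstra procedure described in the Figure 5 caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the toy model `f`, the grid of support points, and all function names are hypothetical, and the edge weight (gradient norm at the segment midpoint times Euclidean length) is an assumed stand-in, because the paper's edge-weight equation is not reproduced in this summary. Attributions are accumulated by running straight-line integrated gradients on each segment of the shortest path.

```python
import heapq
import numpy as np

def f(x):
    # Hypothetical toy model: a steep sigmoid ridge, nearly flat elsewhere.
    return 1.0 / (1.0 + np.exp(-8.0 * (x[0] + x[1] - 1.0)))

def grad_f(x):
    s = f(x)
    return 8.0 * s * (1.0 - s) * np.ones(2)

def segment_ig(a, b, steps=100):
    # Midpoint-rule integrated gradients along the straight segment a -> b.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_f(a + t * (b - a)) for t in alphas])
    return (b - a) * grads.mean(axis=0)

def knn_graph(points, k):
    # Undirected kNN graph built from Euclidean distances.
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    graph = {i: {} for i in range(len(points))}
    for i in range(len(points)):
        for j in np.argsort(dist[i])[1:k + 1]:  # index 0 is the point itself
            j = int(j)
            mid = 0.5 * (points[i] + points[j])
            # ASSUMED edge weight: gradient norm at midpoint * segment length.
            w = float(np.linalg.norm(grad_f(mid)) * dist[i, j])
            graph[i][j] = w
            graph[j][i] = w
    return graph

def dijkstra(graph, src, dst):
    # Standard Dijkstra; returns the node sequence from src to dst.
    best, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > best[u]:
            continue
        for v, w in graph[u].items():
            nd = d + w
            if nd < best.get(v, np.inf):
                best[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

# Support points: a regular grid (an assumption; any point cloud could be used).
grid = np.linspace(-1.0, 2.0, 7)
cloud = np.array([[a, b] for a in grid for b in grid])
baseline = np.array([-0.75, -0.6])
x = np.array([1.3, 0.9])
points = np.vstack([baseline, cloud, x])  # node 0 = baseline, last node = input

graph = knn_graph(points, k=8)
path = dijkstra(graph, 0, len(points) - 1)
attr = np.sum([segment_ig(points[u], points[v])
               for u, v in zip(path[:-1], path[1:])], axis=0)

print(path)
print(attr.sum(), f(x) - f(baseline))  # should nearly agree along any path
```

Because the segments form a continuous piecewise-linear path from the baseline to the input, the summed attributions should still approximately equal the difference in model output, whichever path the graph search selects.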

Theorems & Definitions (3)

  • Remark 1
  • Theorem 1: Strong Completeness
  • Proof 1