Using the Path of Least Resistance to Explain Deep Networks
Sina Salek, Joseph Enguehard
TL;DR
This work identifies a fundamental flaw in standard Integrated Gradients (IG) that arises from integrating along straight paths in Euclidean space. It proposes Geodesic Integrated Gradients (GIG), which computes attributions by integrating gradients along geodesics of a model-defined Riemannian manifold with metric $G_x = J_x^T J_x$, where $J_x$ is the model's Jacobian at $x$. It provides two practical strategies for approximating geodesics: $k$NN graph-based shortest paths for small models, and energy-based sampling via Stochastic Variational Inference (SVI) for larger ones. It also introduces a new axiom, Strong Completeness, and shows that GIG is the only path-based method that satisfies it. Empirically, GIG outperforms IG and other explainers on synthetic half-moon data and on real-world Pascal VOC 2012 images; the $k$NN variant delivers robust performance, while SVI offers deeper exploration at the cost of additional tuning and computation. By accounting for the gradient landscape and curvature of the input space, GIG yields more faithful attributions and so supports more reliable interpretation in safety- and bias-sensitive applications, albeit with substantial computational demands and hyperparameters to tune.
Abstract
Integrated Gradients (IG), a widely used axiomatic path-based attribution method, assigns importance scores to input features by integrating model gradients along a straight path from a baseline to the input. While effective in some cases, we show that straight paths can lead to flawed attributions. In this paper, we identify the cause of these misattributions and propose an alternative approach that treats the input space as a Riemannian manifold, computing attributions by integrating gradients along geodesics. We call this method Geodesic Integrated Gradients (GIG). To approximate geodesic paths, we introduce two techniques: a k-Nearest Neighbours-based approach for smaller models and a Stochastic Variational Inference-based method for larger ones. Additionally, we propose a new axiom, Strong Completeness, extending the axioms satisfied by IG. We show that this property is desirable for attribution methods and that GIG is the only method that satisfies it. Through experiments on both synthetic and real-world data, we demonstrate that GIG outperforms existing explainability methods, including IG.
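To make the $k$NN idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical toy scalar model with an analytic gradient, a regular grid of points standing in for the data, and an edge cost equal to the segment length under $G_x = J_x^T J_x$ with $J_x$ evaluated at the segment midpoint. All function names (`model`, `grad`, `edge_cost`, `knn_graph`, `shortest_path`, `segment_ig`, `path_attributions`) are illustrative. The approximate geodesic is a shortest path in the graph, and attributions are obtained by summing straight-line integrated gradients over its segments.

```python
import heapq
import math

def model(x):
    # Hypothetical toy scalar model; stands in for a trained network.
    return math.tanh(x[0] * x[0] + x[1])

def grad(x):
    # Analytic gradient of the toy model.
    s = 1.0 - math.tanh(x[0] * x[0] + x[1]) ** 2
    return (2.0 * x[0] * s, s)

def edge_cost(a, b):
    # Assumed cost: length of segment a->b under G = J^T J, with J taken
    # at the midpoint. For a scalar model this is |grad . (b - a)|.
    mid = tuple((ai + bi) / 2 for ai, bi in zip(a, b))
    g = grad(mid)
    return abs(sum(gi * (bi - ai) for gi, ai, bi in zip(g, a, b)))

def knn_graph(points, k):
    # Symmetric k-nearest-neighbour graph (neighbours chosen by Euclidean distance).
    graph = {i: [] for i in range(len(points))}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        for j in others[:k]:
            c = edge_cost(p, points[j])
            graph[i].append((j, c))
            graph[j].append((i, c))
    return graph

def shortest_path(graph, src, dst):
    # Dijkstra's algorithm; assumes dst is reachable from src.
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, math.inf):
            continue
        for v, c in graph[u]:
            nd = d + c
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def segment_ig(a, b, steps=50):
    # Straight-line integrated gradients from a to b (midpoint Riemann sum).
    attr = [0.0] * len(a)
    for s in range(steps):
        t = (s + 0.5) / steps
        x = tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))
        g = grad(x)
        for i in range(len(a)):
            attr[i] += g[i] * (b[i] - a[i]) / steps
    return attr

def path_attributions(points, path):
    # Sum per-segment attributions along the piecewise-linear path.
    total = [0.0] * len(points[0])
    for u, v in zip(path, path[1:]):
        seg = segment_ig(points[u], points[v])
        total = [t + s for t, s in zip(total, seg)]
    return total

# A 9x9 grid on [-1, 1]^2 plays the role of the dataset.
points = [(i * 0.25, j * 0.25) for i in range(-4, 5) for j in range(-4, 5)]
graph = knn_graph(points, k=4)
baseline = points.index((-1.0, -1.0))
target = points.index((1.0, 1.0))
path = shortest_path(graph, baseline, target)
attributions = path_attributions(points, path)
```

Because each segment is integrated along a straight line, the per-feature attributions still sum (up to discretisation error) to `model(points[target]) - model(points[baseline])`, which is the usual completeness property of path methods. With a real network, `grad` would come from automatic differentiation, and the choice of `k` and of the edge cost would need tuning.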
