Interpreting Deep Neural Networks with the Package innsight

Niklas Koenen; Marvin N. Wright

TL;DR

This work introduces innsight, an R package that unifies major neural-network feature attribution methods into a single, library- and backend-agnostic workflow. By converting diverse models into a torch-backed representation, it enables gradient-based, LRP, DeepLift, DeepSHAP, and model-agnostic explanations with a consistent three-step pipeline from model to visualization, including tabular, signal, and image data support and interactive plots. The authors validate that innsight’s attributions closely match Python counterparts (captum, zennit, innvestigate, deeplift, shap) within negligible numerical differences, and demonstrate practical use on penguin and melanoma datasets. They also discuss runtime characteristics, limitations (CPU-only, sequential Torch models), and potential extensions (permutation-based methods, broader model support). Overall, innsight provides efficient, accessible, end-to-end interpretability in R, widening the reach of neural network explanations to researchers and practitioners in the R ecosystem.

Abstract

The R package innsight offers a general toolbox for revealing variable-wise interpretations of deep neural networks' predictions with so-called feature attribution methods. Aside from its unified and user-friendly framework, the package stands out in three ways. First, it is generally the first R package to implement feature attribution methods for neural networks. Second, it operates independently of the deep learning library, allowing the interpretation of models from any R package, including keras, torch, neuralnet, and even custom models. Despite this flexibility, innsight benefits internally from the fast and efficient array calculations of the torch package, which builds on LibTorch (PyTorch's C++ backend) without a Python dependency. Finally, it offers a variety of visualization tools for tabular, signal, and image data, or a combination of these. Additionally, the plots can be rendered interactively using the plotly package.

Paper Structure (9 sections, 9 equations, 5 figures)

Introduction
Methodology of feature attribution
Gradient-based methods
Layer-wise relevance propagation (LRP)
Deep learning important features (DeepLift)
Connection weights
Choice of the method
Functionality and usage
Step 1 - Convert the model

Figures (5)

Figure 1: General procedure of feature attribution methods: First, an input instance $\bm{x}$ flows through the model $f$ to obtain a prediction $\bm{\hat{y}}$. Then, the desired output node or class $\hat{y}_c$ to be explained is selected. Finally, the relevance $R_i^c$ of the individual input variables $i$ at the selected output $c$ is calculated in a backward pass.
Figure 2: A summary of gradient-based feature attribution methods, including their mathematical representation. They are divided into blocks based on their underlying objectives. For example, in the case of feature-wise relevances $R_i^c$ obtained from Gradient$\times$Input, the goal is to achieve a sum that equals $f(\bm{x})_c$, i.e., $\sum_{i = 1}^p R_i^c = f(\bm{x})_c$.
Figure 3: (a) illustrates the layer-by-layer backpropagation of relevances $R_i^l$ from the prediction score to the input variables through the use of relevance messages $r_{i \leftarrow j}$. For a hidden layer, (b) demonstrates how the relevance of the lower layer $l$ results from summing all incoming relevance messages.
Figure 4: innsight utilizes the package torch, which builds directly on the C++ library LibTorch without a Python dependency.
Figure 5: (a) displays the visualizations of the plot() and boxplot() functions applied to the DeepSHAP method on the bike sharing dataset. In (b), the internal conversion process of creating a new Converter object is shown, which is identical to calling the shortcut function convert().
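
The sum property mentioned in the caption of Figure 2 can be checked by hand in the simplest setting. The following is a minimal sketch in plain Python, not the innsight API: it assumes a bias-free linear model f(x) = Wx, for which the gradient of output c with respect to input i is simply W[c][i], so Gradient×Input sums exactly to f(x)_c. The weights, the input, and the function names are hypothetical and chosen only for illustration; for nonlinear networks the equality is a goal of the method rather than a guarantee.

```python
# Toy check of the Gradient x Input sum property on a bias-free linear model.
# All names and numbers below are hypothetical; this is not innsight code.

def predict(weights, x):
    """Return f(x) = W x as a list with one score per output node."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def gradient_times_input(weights, x, c):
    """Relevances R_i^c = x_i * d f(x)_c / d x_i.

    For a linear model without bias, the partial derivative
    d f(x)_c / d x_i equals the weight W[c][i].
    """
    return [weights[c][i] * x[i] for i in range(len(x))]

W = [[0.5, -1.0, 2.0],    # weights of output node 0
     [1.5, 0.25, -0.5]]   # weights of output node 1
x = [2.0, 4.0, 1.0]       # one input instance with p = 3 variables
c = 0                     # output node to be explained

relevances = gradient_times_input(W, x, c)
prediction = predict(W, x)[c]

print("relevances:", relevances)
print("sum of relevances:", sum(relevances))
print("f(x)_c:", prediction)
```

In this toy case the relevances add up to the selected output score, which is the decomposition that Gradient×Input aims for.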
