Extracting Explanations, Justification, and Uncertainty from Black-Box Deep Neural Networks

Paul Ardis; Arjuna Flenner

TL;DR

This paper proposes a novel Bayesian approach for extracting explanations, justifications, and uncertainty estimates from DNNs. The approach is efficient in both memory and computation, and it can be applied to any black-box DNN without retraining.

Abstract

Deep Neural Networks (DNNs) do not inherently compute or exhibit empirically justified task confidence. In mission-critical applications, it is important to understand both the DNN's reasoning and the evidence that supports it. In this paper, we propose a novel Bayesian approach to extract explanations, justifications, and uncertainty estimates from DNNs. Our approach is efficient in both memory and computation, and it can be applied to any black-box DNN without retraining, including in anomaly detection and out-of-distribution detection tasks. We validate our approach on the CIFAR-10 dataset and show that it can significantly improve the interpretability and reliability of DNNs.

Paper Structure (6 sections, 9 equations, 5 figures)

Introduction
Advantages of our approach
Background: Gaussian Processes
Method
Experiments
Conclusions

Figures (5)

Figure 1: An illustration of the example-based XAI approach of Virani et al. [ViraniIY20]. Their approach uses the transformation spaces from intermediate layers of a pre-trained DNN to evaluate the model's justification, and it builds a support neighborhood around each test sample's transformed point.
Figure 2: An illustration of the proposed example-based XAI approach.
Figure 3: Label accuracy for CIFAR-10
Figure 4: Epistemic operation for CIFAR-10
Figure 5: Inference time for CIFAR-10
