BayesNAM: Leveraging Inconsistency for Reliable Explanations

Hoki Kim, Jinseong Park, Yujin Choi, Seungyun Lee, Jaewook Lee

TL;DR

The paper introduces the Bayesian Neural Additive Model (BayesNAM), a framework that integrates Bayesian neural networks with feature dropout, and proves theoretically that feature dropout effectively captures model inconsistencies.

Abstract

The neural additive model (NAM) is a recently proposed explainable artificial intelligence (XAI) method that utilizes neural network-based architectures. Given the advantages of neural networks, NAMs provide intuitive explanations for their predictions while maintaining high model performance. In this paper, we analyze a critical yet overlooked phenomenon: NAMs often produce inconsistent explanations, even when using the same architecture and dataset. Traditionally, such inconsistencies have been viewed as issues to be resolved. We argue instead that these inconsistencies can provide valuable explanations within the given data model. Through a simple theoretical framework, we demonstrate that these inconsistencies are not mere artifacts but emerge naturally in datasets with multiple important features. To leverage this information effectively, we introduce a novel framework, the Bayesian Neural Additive Model (BayesNAM), which integrates Bayesian neural networks and feature dropout, and we prove theoretically that feature dropout effectively captures model inconsistencies. Our experiments demonstrate that BayesNAM reveals potential problems such as insufficient data or structural limitations of the model, providing more reliable explanations and potential remedies.
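To make the additive structure and the role of feature dropout concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a NAM logit is a sum of per-feature mapping functions $f_i(x_i)$ and that feature dropout zeroes each feature's contribution independently with some probability. The function names (`make_shape_function`, `nam_contributions`, `feature_dropout`, `nam_logit`), the tiny random-weight networks, and the absence of rescaling after dropout are all illustrative assumptions.

```python
import numpy as np


def make_shape_function(rng, hidden=16):
    """Build a tiny one-input network f_i: R -> R.

    Random weights are a hypothetical stand-in for a trained feature network.
    """
    w1 = rng.normal(size=hidden)
    b1 = rng.normal(size=hidden)
    w2 = rng.normal(size=hidden) / hidden

    def f(x):
        # x has shape (n,); the hidden layer has shape (n, hidden).
        return np.tanh(np.outer(x, w1) + b1) @ w2

    return f


def nam_contributions(shape_functions, X):
    """Return per-feature contributions f_i(x_i), shape (n_samples, n_features)."""
    return np.column_stack([f(X[:, i]) for i, f in enumerate(shape_functions)])


def feature_dropout(contribs, drop_prob, rng):
    """Zero each contribution independently with probability drop_prob.

    Assumption: no rescaling of the surviving contributions.
    """
    keep = rng.random(contribs.shape) >= drop_prob
    return contribs * keep


def nam_logit(contribs, bias=0.0):
    """Additive prediction: the sum of feature contributions plus a bias."""
    return contribs.sum(axis=1) + bias


rng = np.random.default_rng(0)
shape_functions = [make_shape_function(rng) for _ in range(3)]
X = np.array([[-1.0, 3.0, 3.0]])  # the sample used in the Figure 5 caption

contribs = nam_contributions(shape_functions, X)
dropped = feature_dropout(contribs, drop_prob=0.5, rng=rng)
print("contributions:", contribs)
print("logit:", nam_logit(contribs))
print("logit with feature dropout:", nam_logit(dropped))
```

Because the prediction is a plain sum, each column of `contribs` can be read directly as that feature's contribution to the explanation, and dropping a column forces the remaining features to carry the prediction on their own.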

Paper Structure

This paper contains 11 sections, 3 theorems, 40 equations, 15 figures, 5 tables.

Key Result

Lemma 1

(Derived from tsipras2018robustness) Consider a linear classifier $h$. Then even a natural linear classifier $h(\cdot)$ with $w_i = \frac{1}{d-1}$ can easily achieve a classification accuracy higher than $p$, the natural accuracy of a model that uses only $x_1$, if the following statement is satisfied: where $\Phi_X(\cdot)$ is the cumulative distribution function of $X$. (Detailed pr

Figures (15)

  • Figure 1: Inconsistency of NAM, where two independent NAMs trained with the same dataset and architecture output different explanations solely due to different random seeds.
  • Figure 2: Example of a mapping function $f_i$ of NAM. Blue regions correspond to regions with high data density. NAM enables us to capture non-linear relationships between inputs and outputs and further provide a clear understanding.
  • Figure 3: (Case-I. $\lambda=0$) Mapping functions of two NAMs trained with different random seeds show similar shapes. Blue regions correspond to regions with high data density.
  • Figure 4: (Case-II. $\lambda=3$) Mapping functions of two NAMs trained with different random seeds are extremely different. This yields the inconsistent feature contributions in Fig. 5.
  • Figure 5: Corresponding feature contributions of a sample $\bm{x}=[-1, 3, 3]$ with $y=1$ for the NAMs in Fig. 4. This inconsistent explanation corresponds to Fig. 1.
  • ...and 10 more figures

Theorems & Definitions (7)

  • Lemma 1
  • Theorem 1
  • proof: Sketch of proof
  • proof
  • Lemma 2
  • proof
  • proof