
Linking Model Intervention to Causal Interpretation in Model Explanation

Debo Cheng, Ziqi Xu, Jiuyong Li, Lin Liu, Kui Yu, Thuc Duy Le, Jixue Liu

TL;DR

This paper studies the conditions under which an intuitive model intervention effect has a causal interpretation, i.e., when it indicates whether a feature is a direct cause of the outcome.

Abstract

Intervention intuition is often used in model explanation, where the intervention effect of a feature on the outcome is quantified by the difference in the model's prediction when the feature value is changed from its current value to a baseline value. Such a model intervention effect is inherently associational. In this paper, we study the conditions under which an intuitive model intervention effect has a causal interpretation, i.e., when it indicates whether a feature is a direct cause of the outcome. This work links the model intervention effect to the causal interpretation of a model. Such an interpretation capability is important because it indicates whether a machine learning model is trustworthy to domain experts. The conditions also reveal the limitations of using a model intervention effect for causal interpretation in an environment with unobserved features. Experiments on semi-synthetic datasets validate the theorems and show the potential of using the model intervention effect for model interpretation.
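The quantity described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's formal Definition 1: the function names, the sign convention (prediction at the current value minus prediction at the baseline), the toy model, and the choice of baseline `0.0` are all hypothetical.

```python
import numpy as np

def model_intervention_effects(predict, X, i, baseline):
    """Per-sample change in prediction when feature i is moved
    from its current value to `baseline` (assumed sign: current - baseline)."""
    X = np.asarray(X, dtype=float)
    X_base = X.copy()
    X_base[:, i] = baseline          # set feature i to the baseline for every sample
    return predict(X) - predict(X_base)

def average_mie(predict, X, i, baseline):
    """Average of the per-sample effects over the dataset."""
    return float(np.mean(model_intervention_effects(predict, X, i, baseline)))

# Hypothetical toy model: the prediction depends on feature 0 only.
def toy_predict(X):
    return 2.0 * X[:, 0] + 0.0 * X[:, 1]

X = np.array([[1.0, 5.0],
              [3.0, -2.0],
              [2.0, 0.0]])

amie_0 = average_mie(toy_predict, X, i=0, baseline=0.0)
amie_1 = average_mie(toy_predict, X, i=1, baseline=0.0)
print(amie_0, amie_1)
```

In this toy case the averaged effect is nonzero for the feature the model actually uses and zero for the one it ignores. Whether such a value also says something about the data-generating process is the question the paper addresses.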

Paper Structure

This paper contains 24 sections, 8 theorems, 5 figures, 7 tables.

Key Result

Theorem 1

Suppose that the data-graph faithfulness and data-model faithfulness assumptions hold, and that there are no unobserved variables. If $\mathbf{X}$ includes no descendant variables of $Y$, then $\forall X_i\in\mathbf{X}$, $\operatorname{AMIE}(X_i, Y) \ne 0$ if and only if $X_i$ is a direct cause of $Y$.
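A small simulation can illustrate the flavour of this statement. It is a sketch under assumptions that do not come from the paper: a hypothetical structural model with binary features in which $X_1$ is a direct cause of $Y$ and $X_2$ is a noisy copy of $X_1$ with no edge into $Y$, a linear least-squares model, a baseline of `0`, and a signed average of prediction differences. The theorem is a population-level statement, so on finite samples "zero" can only be read as "close to zero".

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (all variables observed):
#   X1 -> Y,  X1 -> X2,  and no edge from X2 to Y.
x1 = rng.integers(0, 2, size=n)
flip = rng.random(n) < 0.2
x2 = np.where(flip, 1 - x1, x1)              # noisy copy of X1, associated with Y
y = 3.0 * x1 + rng.normal(0.0, 1.0, size=n)

X = np.column_stack([x1, x2]).astype(float)

# Fit an ordinary least-squares model with an intercept.
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

def predict(X):
    return coef[0] + X @ coef[1:]

def amie(predict, X, i, baseline=0.0):
    """Mean change in prediction when feature i is set to the baseline."""
    X_base = X.copy()
    X_base[:, i] = baseline
    return float(np.mean(predict(X) - predict(X_base)))

amie_x1 = amie(predict, X, 0)   # direct cause: clearly nonzero
amie_x2 = amie(predict, X, 1)   # associated non-cause: close to zero
print(f"AMIE(X1) = {amie_x1:.3f}, AMIE(X2) = {amie_x2:.3f}")
```

Although $X_2$ is marginally correlated with $Y$, the fitted model assigns it almost no weight once $X_1$ is included, so its averaged intervention effect stays near zero while that of $X_1$ does not.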

Figures (5)

  • Figure 1: The DAG representing the data generation mechanism assumed in this paper.
  • Figure 2: Four exemplar causal DAGs illustrating the three false positive cases in Theorem 3. Specifically: (a) Case 1, $X_j$ is a parent of $X_i$, and $X_i$ is a proxy of the unobserved direct cause $U_i$; (b) Case 2, $X_j$ shares a common unobserved confounder $U_j$ with $X_i$, which is a proxy of the unobserved direct cause $U_i$; (c) Case 3, the inducing path between $X_j$ and $Y$, where $X_m$ is a collider and a cause of $Y$; (d) Case 3, the relaxed inducing path between $X_j$ and $Y$, where $X_m$ is a collider and a cause of $Y$.
  • Figure 3: Four exemplar causal DAGs used in the discussions in the summary section.
  • Figure 4: The Bayesian network for Insurance.
  • Figure 5: The Bayesian network for Water.

Theorems & Definitions (16)

  • Definition 1: Model Intervention Effects (MIE) and the average MIE (AMIE)
  • Theorem 1: Linking AMIEs and direct causes of $Y$ without the presence of unobserved variables
  • Theorem 2: Linking AMIE with direct causes in the presence of unobserved variables
  • Definition 2: Inducing Paths and Relaxed Inducing Paths
  • Theorem 3: Cases for false linkages of AMIEs with causality
  • Theorem 4: A test for false linkages
  • Definition 3: d-separation (Pearl, 2009)
  • Definition 4: Causal Sufficiency (Spirtes et al., 2000)
  • Theorem 1: Linking AMIEs and direct causes of $Y$ without the presence of unobserved variables
  • proof
  • ...and 6 more