A Generalized Back-Door Criterion for Linear Regression

Masato Shimokawa

TL;DR

The paper generalizes covariate adjustment criteria for identifying partial and total causal effects in linear structural equation models (SEMs) represented by DAGs. Its selective-door criterion is a unifying condition that extends the back-door and single-door criteria. The paper shows that partial regression coefficients consistently identify causal effects not mediated by the covariates when the error terms are uncorrelated and distribution-free, and it explains post-treatment bias in terms of sequences of nodes in the graph. It provides proofs, corollaries, and illustrative examples, including SVAR-like settings, and notes that the results are limited to linear causal structures, with extension to nonlinear models left as possible future work. The work strengthens the theoretical basis for interpreting regression coefficients causally in applied settings while delineating the boundaries of its applicability.

Abstract

What assumptions about the data-generating process are required to permit a causal interpretation of partial regression coefficients? To answer this question, this paper generalizes Pearl's single-door and back-door criteria and proposes a new criterion that enables the identification of total or partial causal effects. In addition, this paper elucidates the mechanism of post-treatment bias, showing that a repeated sequence of nodes can be a potential source of this bias. The results apply to linear data-generating processes represented by directed acyclic graphs with distribution-free error terms.
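The question in the abstract can be made concrete with a small simulation. The sketch below is not taken from the paper: the graph, the coefficient values, and the helper `ols_coefs` are all assumptions chosen for illustration. It draws data from a linear DAG with non-Gaussian (uniform) errors, in which Z confounds X and Y, M mediates part of the effect of X on Y, and D is a descendant of Y. It then compares the coefficient on X under different covariate sets.

```python
import numpy as np


def ols_coefs(y, *regressors):
    """Least-squares slopes of y on the regressors (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones_like(y), *regressors])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1:]


rng = np.random.default_rng(0)
n = 200_000


def noise():
    # Uniform errors: deliberately non-Gaussian, mutually independent.
    return rng.uniform(-1.0, 1.0, size=n)


# Hypothetical structural coefficients (assumptions for this toy example).
a, b, c, d, g = 0.8, 0.5, 1.0, 2.0, 1.5

Z = noise()                           # confounder
X = a * Z + noise()                   # treatment
M = b * X + noise()                   # mediator on the path X -> M -> Y
Y = c * X + d * M + g * Z + noise()   # outcome
D = Y + noise()                       # descendant of the outcome

total_effect = c + b * d              # 2.0: direct path plus the path through M
direct_effect = c                     # 1.0: effect not mediated by M

naive = ols_coefs(Y, X)[0]            # no adjustment: confounded by Z
adj_z = ols_coefs(Y, X, Z)[0]         # adjust for Z: estimates the total effect
adj_zm = ols_coefs(Y, X, Z, M)[0]     # adjust for Z and M: estimates the direct effect
adj_zd = ols_coefs(Y, X, Z, D)[0]     # adjust for a descendant of Y: biased

print(f"Y ~ X         : {naive:.3f}")
print(f"Y ~ X + Z     : {adj_z:.3f}  (total effect  = {total_effect})")
print(f"Y ~ X + Z + M : {adj_zm:.3f}  (direct effect = {direct_effect})")
print(f"Y ~ X + Z + D : {adj_zd:.3f}")
```

In this toy model, adjusting for Z alone targets the total effect, adding the mediator M changes the target to the part of the effect not passing through M, and adding the post-treatment variable D pulls the coefficient away from both. Which covariate sets are valid in general is what the paper's criterion is meant to decide; the simulation only illustrates one simple case.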

Paper Structure

This paper contains 12 sections, 11 theorems, 48 equations, 4 figures.

Key Result

Proposition 2.4

For the residual $\epsilon_{Y\bm{X}}$, the following properties hold:

Figures (4)

  • Figure 1: The causal diagram associated with equation \ref{eq: eg1}
  • Figure 2: The causal diagram associated with equation \ref{eq: eg2}
  • Figure 3: The causal diagram associated with equation \ref{eq: eg3}
  • Figure 4: The DAG associated with equation \ref{eq: eg3bn}

Theorems & Definitions (31)

  • Definition 2.2: Population partial regression coefficient
  • Definition 2.3: Population linear regression equation
  • Proposition 2.4
  • Proposition 2.5
  • Proposition 2.6
  • Definition 2.7: Causal diagram
  • Definition 2.8: Blocking
  • Definition 2.9: Back-door path
  • Definition 2.10: Back-door criterion
  • Definition 2.11: Selective-door criterion
  • ...and 21 more