
Neural Networks with Causal Graph Constraints: A New Approach for Treatment Effects Estimation

Roger Pros, Jordi Vitrià

TL;DR

NN-CGC addresses heterogeneous treatment effect estimation by introducing a neural-network inductive bias that constrains learning to respect a given causal graph. It constructs groups of variables $G_{x_i}$ from the graph and restricts the learned distribution to causally valid interactions between variables, so it can be combined with existing base models such as TARNet, Dragonnet, or BCAUSS. Across synthetic, semi-synthetic (IHDP), and real (Jobs) datasets, NN-CGC achieves state-of-the-art performance on the PEHE and ATE metrics, remains robust to imperfect graphs, and benefits even from partial causal information. The approach is modular and can be stacked with other inductive biases, and it points to future enhancements such as masking and a graphical conditioner for further efficiency and accuracy.

Abstract

In recent years, there has been a growing interest in using machine learning techniques for the estimation of treatment effects. Most of the best-performing methods rely on representation learning strategies that encourage shared behavior among potential outcomes to increase the precision of treatment effect estimates. In this paper we discuss and classify these models in terms of their algorithmic inductive biases and present a new model, NN-CGC, that considers additional information from the causal graph. NN-CGC tackles bias resulting from spurious variable interactions by implementing novel constraints on models, and it can be integrated with other representation learning methods. We test the effectiveness of our method using three different base models on common benchmarks. Our results indicate that our model constraints lead to significant improvements, achieving new state-of-the-art results in treatment effects estimation. We also show that our method is robust to imperfect causal graphs and that using partial causal information is preferable to ignoring it.

Paper Structure

This paper contains 26 sections, 2 equations, 2 figures, 7 tables, 1 algorithm.

Figures (2)

  • Figure 1: Illustrative example of groups of variables that are allowed to interact with each other. In this setting, the allowed interactions are: (X1, X2, X4), (X3, X1), (X5, X3, X4). In this specific case, $G_{x_4}$ is equal to $G_T$.
  • Figure 2: Model architecture when applying CGC to Dragonnet. The post-representation part remains identical, but the pre-representation layers are divided according to the groups of variables.
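
The idea behind the two captions can be illustrated with a minimal sketch in plain Python. It takes the three groups listed for Figure 1 as given and builds one independent pre-representation layer per group, so that each hidden block only sees the variables of its own group. The function names (`init_group_layers`, `encode`), the single `tanh` layer, the hidden size, and the random weights are all assumptions made for illustration; they are not the authors' implementation, and the sketch does not show how the groups are derived from the causal graph.

```python
import math
import random

# Groups of variables allowed to interact, copied from the Figure 1 caption
# (0-based column indices: X1 -> 0, X2 -> 1, ..., X5 -> 4).
GROUPS = [(0, 1, 3), (2, 0), (4, 2, 3)]


def init_group_layers(groups, hidden_units, seed=0):
    """Create one independent dense layer (weights, biases) per group.

    Hypothetical helper: layer shape and initialisation are illustrative only.
    """
    rng = random.Random(seed)
    layers = []
    for group in groups:
        weights = [[rng.gauss(0.0, 1.0) for _ in group] for _ in range(hidden_units)]
        biases = [0.0] * hidden_units
        layers.append((weights, biases))
    return layers


def encode(x, groups, layers):
    """Return one hidden block per group; each block reads only its own variables."""
    blocks = []
    for group, (weights, biases) in zip(groups, layers):
        inputs = [x[i] for i in group]
        block = [
            math.tanh(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)
        ]
        blocks.append(block)
    return blocks


layers = init_group_layers(GROUPS, hidden_units=4)

x = [0.5, -1.0, 2.0, 0.1, 1.5]
x_perturbed = list(x)
x_perturbed[1] += 1.0  # change X2, which appears only in the first group

before = encode(x, GROUPS, layers)
after = encode(x_perturbed, GROUPS, layers)

# Which group blocks reacted to the change in X2?
changed = [b != a for b, a in zip(before, after)]
print(changed)
```

Perturbing X2 alters only the block of the first group, because X2 is not an input to the other two layers. In a model along the lines of Figure 2, the per-group blocks would presumably be concatenated into the shared representation that feeds the unchanged post-representation heads; that concatenation step is likewise an assumption of this sketch.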