An Analysis of Causal Effect Estimation using Outcome Invariant Data Augmentation

Uzair Akbar, Niki Kilbertus, Hao Shen, Krikamol Muandet, Bo Dai

TL;DR

The work addresses causal effect estimation from observational data by reinterpreting data augmentation (DA) as a pathway to simulate interventions on the treatment mechanism when the outcome is invariant to the augmentation. It introduces IV-like (IVL) regression to relax IV assumptions while still mitigating confounding, and shows that composing parameterized DA with IVL can realize worst-case, adversarial interventions that further reduce bias. The authors provide population-level theory in linear SEMs, supported by finite-sample simulations and real-data experiments (optical-device and Colored MNIST) demonstrating improved causal estimates and robust prediction across interventions. The approach broadens the accessibility of causal inference by leveraging commonly available DA in the absence of valid IVs, while acknowledging the need for domain knowledge and careful tuning of regularization parameters. Overall, outcome-invariant DA combined with IVL regression offers a principled, practically grounded framework for reducing confounding bias and enhancing generalization across treatment interventions.

Abstract

The technique of data augmentation (DA) is often used in machine learning for regularization purposes to better generalize under i.i.d. settings. In this work, we present a unifying framework with topics in causal inference to make a case for using DA not just in the i.i.d. setting, but for generalization across interventions as well. Specifically, we argue that when the outcome generating mechanism is invariant to our choice of DA, then such augmentations can effectively be thought of as interventions on the treatment generating mechanism itself. This can potentially help to reduce bias in causal effect estimation arising from hidden confounders. In the presence of such unobserved confounding we typically make use of instrumental variables (IVs) -- sources of treatment randomization that are conditionally independent of the outcome. However, IVs may not be as readily available as DA for many applications, which is the main motivation behind this work. By appropriately regularizing IV-based estimators, we introduce the concept of IV-like (IVL) regression for mitigating confounding bias and improving predictive performance across interventions even when certain IV properties are relaxed. Finally, we cast parameterized DA as an IVL regression problem and show that, when used in composition, it can simulate a worst-case application of such DA, further improving performance on causal estimation and generalization tasks beyond what simple DA may offer. This is shown both theoretically for the population case and via simulation experiments for the finite sample case using a simple linear example. We also present real data experiments to support our case.
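The claim that outcome-invariant DA behaves like an intervention on the treatment can be illustrated with a minimal sketch. This is not the authors' experimental setup. It is a toy linear Gaussian model assumed here for illustration: the function name `simulate`, the choice of `beta`, the confounding structure, and the DA strength `gamma` are all hypothetical. The hidden confounder `c` drives both the outcome and the second treatment coordinate, which has no causal effect on the outcome, so perturbing that coordinate leaves the outcome unchanged.

```python
import numpy as np


def simulate(n=200_000, gamma=3.0, seed=0):
    """Toy comparison of plain ERM (OLS) and DA+ERM under hidden confounding.

    All modelling choices here are illustrative assumptions, not the paper's.
    """
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 0.0])        # true causal effect: f(x) = beta @ x
    c = rng.normal(size=n)             # hidden confounder (never observed)

    x = rng.normal(size=(n, 2))        # treatment
    x[:, 1] += c                       # confounder drives the 2nd coordinate
    y = x @ beta + c + 0.1 * rng.normal(size=n)  # outcome, confounded by c

    # Outcome-invariant DA: add noise only along the null space of beta,
    # so f(x_aug) == f(x) and the labels y can be reused unchanged.
    null_dir = np.array([0.0, 1.0])
    x_aug = x + gamma * rng.normal(size=(n, 1)) * null_dir

    def ols(a, b):
        # Least-squares fit without intercept (all variables are zero-mean).
        return np.linalg.lstsq(a, b, rcond=None)[0]

    return beta, ols(x, y), ols(x_aug, y)


if __name__ == "__main__":
    beta, h_erm, h_da = simulate()
    print("ERM    estimate:", h_erm, "error:", np.linalg.norm(h_erm - beta))
    print("DA+ERM estimate:", h_da, "error:", np.linalg.norm(h_da - beta))
```

In this toy model the population ERM coefficient on the spurious coordinate is $1/2$, while DA with strength $\gamma$ shrinks it to $1/(2+\gamma^2)$, so the augmentation acts like a soft intervention that weakens the confounded dependence. The sketch only removes bias in the directions the DA perturbs. It says nothing about confounding along directions that affect the outcome.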

Paper Structure

This paper contains 45 sections, 14 theorems, 98 equations, 9 figures, 2 tables.

Key Result

theorem 1

For SEM $\mathfrak{M}$ in Example 1, the following holds:

Figures (9)

  • Figure 1: A picture summary of our contributions. $\rightarrow$ represents composition of operations or transformations, and $\Leftrightarrow$ represents equivalence.
  • Figure 2: The observational distributions of $(GX, Y, G, C)$ and $(X, Y, G, C)$ for graphs (a) and (b), respectively, are the same. The former applies DA on $X$, whereas the latter applies a (soft) intervention on $X$. Furthermore, for the graph in (b), $G$ is IVL.
  • Figure 3: The ground truth function $\mathbf{f}$ in Example 2. The DA applied here corresponds to randomly translating the data samples along their level-set by adding random noise sampled from the null-space of $\mathbf{f}$.
  • Figure 4: Simulation experiment for a linear Gaussian SEM. $\kappa$ represents the amount of confounding, $\gamma$ is the strength of DA and $\alpha$ is the IVL regularization parameter. Each data-point represents the average $\operatorname{nCER}$ over $25$ trials with a $95\%$ confidence interval (CI).
  • Figure 5: Experiment results; common OOD generalisation benchmarks compared against the ERM, DA+ERM and DA+IV baselines including DA+IVL.
  • ...and 4 more figures

Theorems & Definitions (24)

  • example 1: a linear Gaussian IVL example
  • theorem 1: robust prediction with IVL regression
  • theorem 2: causal estimation with IVL regression
  • example 2: a linear Gaussian DA example
  • theorem 3: causal estimation with DA+ERM
  • corollary 1: worst-case DA with DA+IVL regression
  • corollary 2: causal estimation with DA+IVL regression
  • proposition 1: IVL$_\alpha$ closed form solution
  • proposition 1: IVL$_\alpha$ closed form solution
  • proof
  • ...and 14 more