Estimating individual treatment effect: generalization bounds and algorithms

Uri Shalit; Fredrik D. Johansson; David Sontag

TL;DR

The paper addresses the estimation of individual treatment effects (ITE) from observational data under the strong ignorability assumption. It introduces Counterfactual Regression (CFR), a representation-learning framework that learns a representation in which the treated and control distributions are close. On the theory side, it gives a bound, stated in terms of Integral Probability Metrics (IPMs), that controls ITE estimation error by the standard factual loss plus a term measuring the imbalance between the two groups. On the practical side, it presents an end-to-end algorithm with two outcome heads, one per treatment arm, trained with a penalty on that imbalance. Experiments on the semi-synthetic IHDP data and the real-world Jobs data show that CFR often matches or exceeds state-of-the-art methods, with notable gains when the treated and control groups are strongly imbalanced and on the policy-risk metric. The contributions are a balance-based criterion for causal inference and a scalable non-linear method for ITE estimation in observational settings.
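The objective summarized above (factual loss from two outcome heads plus an imbalance penalty on a shared representation) can be sketched in NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the single tanh layer, the linear heads, the squared distance between group means (a linear-kernel MMD) as the balance term, and the synthetic data are all hypothetical choices, and the sketch omits any sample re-weighting or model-complexity regularization that a full method might use.

```python
import numpy as np

def linear_mmd_sq(phi_treated, phi_control):
    """Squared distance between the group means of the representation
    (a linear-kernel MMD); used here as an assumed balance term."""
    diff = phi_treated.mean(axis=0) - phi_control.mean(axis=0)
    return float(diff @ diff)

def cfr_style_objective(x, t, y, W, w0, w1, alpha):
    """Factual squared loss of a two-head model plus alpha * imbalance.
    The architecture (one tanh layer, linear heads) is an assumption."""
    phi = np.tanh(x @ W)                           # shared representation Phi(x)
    pred = np.where(t == 1, phi @ w1, phi @ w0)    # head picked by observed treatment
    factual_loss = float(np.mean((pred - y) ** 2))
    imbalance = linear_mmd_sq(phi[t == 1], phi[t == 0])
    return factual_loss + alpha * imbalance, factual_loss, imbalance

# Synthetic, confounded data: treatment probability depends on x[:, 0].
rng = np.random.default_rng(0)
n, d, k = 200, 5, 3
x = rng.normal(size=(n, d))
t = (rng.random(n) < 1.0 / (1.0 + np.exp(-x[:, 0]))).astype(int)
y = x[:, 1] + 2.0 * t + rng.normal(scale=0.1, size=n)

# Random (untrained) parameters, just to evaluate the objective once.
W = rng.normal(size=(d, k))
w0 = rng.normal(size=k)
w1 = rng.normal(size=k)

alpha = 1.0
total, factual_loss, imbalance = cfr_style_objective(x, t, y, W, w0, w1, alpha)

# Estimated ITE for each unit: difference between the two heads.
ite_hat = np.tanh(x @ W) @ (w1 - w0)
```

In a real training loop the parameters `W`, `w0`, and `w1` would be optimized by gradient descent on `total`; setting `alpha = 0` recovers plain two-head regression with no balancing.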

Abstract

There is intense interest in applying machine learning to problems of causal inference in fields such as healthcare, economics and education. In particular, individual-level causal inference has important applications such as precision medicine. We give a new theoretical analysis and family of algorithms for predicting individual treatment effect (ITE) from observational data, under the assumption known as strong ignorability. The algorithms learn a "balanced" representation such that the induced treated and control distributions look similar. We give a novel, simple and intuitive generalization-error bound showing that the expected ITE estimation error of a representation is bounded by a sum of the standard generalization-error of that representation and the distance between the treated and control distributions induced by the representation. We use Integral Probability Metrics to measure distances between distributions, deriving explicit bounds for the Wasserstein and Maximum Mean Discrepancy (MMD) distances. Experiments on real and simulated data show the new algorithms match or outperform the state-of-the-art.
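The shape of the bound described in the abstract can be written schematically as follows. The symbols are notational assumptions, since the abstract does not fix any: $\Phi$ is the representation, $h$ the outcome hypothesis, $\epsilon_F^{t=0}$ and $\epsilon_F^{t=1}$ the factual losses on the control and treated populations, $p_\Phi^{t=0}$ and $p_\Phi^{t=1}$ the induced distributions, $G$ a function family, and $B_\Phi$ a constant depending on the representation. Additive noise-variance terms are omitted, and the exact constants should be checked against the full paper.

```latex
% Integral Probability Metric between distributions p and q over a function family G
\mathrm{IPM}_G(p, q) \;=\; \sup_{g \in G} \left| \int g(s)\,\bigl(p(s) - q(s)\bigr)\, ds \right|

% Schematic bound: ITE error <= factual errors + distance between induced distributions
\epsilon_{\mathrm{ITE}}(h, \Phi) \;\le\;
  2\Bigl( \epsilon_F^{t=0}(h, \Phi) + \epsilon_F^{t=1}(h, \Phi)
  + B_\Phi \,\mathrm{IPM}_G\bigl(p_\Phi^{t=1},\, p_\Phi^{t=0}\bigr) \Bigr)
```

Taking $G$ to be the 1-Lipschitz functions gives the Wasserstein distance, and taking $G$ to be the unit ball of a reproducing kernel Hilbert space gives the MMD, which are the two cases the abstract names.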
