PairNet: Training with Observed Pairs to Estimate Individual Treatment Effect
Lokesh Nagalapatti, Pranava Singhal, Avishek Ghosh, Sunita Sarawagi
TL;DR
PairNet estimates individual treatment effects (ITE) from observational data by training on pairs of observed examples with a pairwise loss over their factual outcomes, which removes the reliance on noisy pseudo-outcomes. The theoretical analysis establishes consistency and ITE-risk bounds under overlap: the ITE risk is upper-bounded in terms of the Pair loss and an IPM-based distance between the neighbor distribution and the observed covariate distribution, with tighter guarantees than for factual models. Empirically, PairNet delivers significant improvements over a wide range of baselines across binary and continuous treatments, is model-agnostic, and is robust to pairing proximity and hyperparameters. These results suggest that PairNet offers a practical, scalable route to more accurate ITE estimation on real-world observational datasets.
Abstract
Given a dataset of individuals, each described by a covariate vector, a treatment, and the outcome observed under that treatment, the goal of the individual treatment effect (ITE) estimation task is to predict the change in outcome that results from a change in treatment. A fundamental challenge is that in observational data, an individual's outcome is observed under only one treatment, whereas we need to infer the difference in outcomes under two different treatments. Several existing approaches address this issue by training with inferred pseudo-outcomes, but their success depends on the quality of these pseudo-outcomes. We propose PairNet, a novel ITE estimation training strategy that minimizes losses over pairs of examples based on their factual observed outcomes. Theoretical analysis for binary treatments reveals that PairNet is a consistent estimator of ITE risk and achieves smaller generalization error than baseline models. Empirical comparison with thirteen existing methods across eight benchmarks, covering both discrete and continuous treatments, shows that PairNet achieves significantly lower ITE error than the baselines. It is also model-agnostic and easy to implement.
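To make the idea of a loss over pairs of factual outcomes concrete, the following is a minimal sketch for binary treatments, not the authors' implementation. It rests on assumptions the abstract does not confirm: each example is paired with its nearest neighbour in raw covariate space among examples that received the other treatment, and the loss is the squared gap between the observed and predicted outcome differences. The names `make_pairs`, `pair_loss`, `oracle`, and `constant`, and the toy data, are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# (covariates x, binary treatment t, factual outcome y)
Example = Tuple[Sequence[float], int, float]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two covariate vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def make_pairs(data: List[Example]) -> List[Tuple[int, int]]:
    """Pair each example with its nearest neighbour under the other treatment.

    Assumption: proximity is measured in raw covariate space.
    """
    pairs = []
    for i, (x_i, t_i, _) in enumerate(data):
        candidates = [j for j, (_, t_j, _) in enumerate(data) if t_j != t_i]
        if not candidates:
            continue  # no example with the opposite treatment exists
        j = min(candidates, key=lambda k: sq_dist(x_i, data[k][0]))
        pairs.append((i, j))
    return pairs


def pair_loss(
    predict: Callable[[Sequence[float], int], float],
    data: List[Example],
    pairs: List[Tuple[int, int]],
) -> float:
    """Mean squared gap between observed and predicted outcome differences.

    Only factual outcomes (y_i, y_j) are used; no pseudo-outcome is needed.
    """
    total = 0.0
    for i, j in pairs:
        x_i, t_i, y_i = data[i]
        x_j, t_j, y_j = data[j]
        observed_diff = y_i - y_j
        predicted_diff = predict(x_i, t_i) - predict(x_j, t_j)
        total += (observed_diff - predicted_diff) ** 2
    return total / len(pairs)


# Toy data generated by y = x + 2 * t (a constant treatment effect of 2).
data: List[Example] = [
    ([0.0], 0, 0.0),
    ([0.1], 1, 2.1),
    ([1.0], 0, 1.0),
    ([1.2], 1, 3.2),
]
pairs = make_pairs(data)


def oracle(x: Sequence[float], t: int) -> float:
    """Outcome model that matches the toy data-generating process."""
    return x[0] + 2 * t


def constant(x: Sequence[float], t: int) -> float:
    """Outcome model that ignores both covariates and treatment."""
    return 0.0


oracle_loss = pair_loss(oracle, data, pairs)
constant_loss = pair_loss(constant, data, pairs)
```

In this toy setting the oracle model drives the pairwise loss to (numerically) zero, while a model that ignores the treatment is penalised. In practice `predict` would be a trainable outcome model and the loss would be minimised by gradient descent over mini-batches of pairs.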
