A Causal Inference Framework for Data Rich Environments
Alberto Abadie, Anish Agarwal, Devavrat Shah
TL;DR
This paper addresses unobserved confounding in data-rich settings where both the number of units $N$ and the number of measurements per unit $T$ are large. It develops a latent-factor framework that links structural causal models with latent factor models, showing that potential outcomes $Y^{(a)}_{n,t}$ can be approximated by a low-rank linear factor representation under Hölder-smooth latent structure. The authors derive identification results for $ATE$, $ATT$, and $ATU$ under a linear-span condition and propose a PCR-based estimator that achieves finite-sample consistency and asymptotic normality under tractable conditions. The approach nests classical econometric models (e.g., two-way FE, interactive FE) and provides practical tools for counterfactual inference in high-dimensional data environments, with explicit rates and assumptions on smoothness, spectrum, and low-rank approximation.
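As a small worked illustration of the nesting claim, the additive two-way fixed-effects structure can be written as a rank-2 linear factor model. The symbols $\alpha_n$, $\beta_t$, $u_n$, and $v_t$ are notation assumed here for illustration and are not taken from the paper:

```latex
\alpha_n + \beta_t
  = \left\langle
      \begin{pmatrix} \alpha_n \\ 1 \end{pmatrix},
      \begin{pmatrix} 1 \\ \beta_t \end{pmatrix}
    \right\rangle
  = \langle u_n, v_t \rangle,
\qquad u_n, v_t \in \mathbb{R}^2 .
```

Interactive fixed effects correspond to replacing these two-dimensional vectors with general $r$-dimensional unit and time factors.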
Abstract
We propose a formal model for counterfactual estimation with unobserved confounding in "data-rich" settings, i.e., where there are a large number of units and a large number of measurements per unit. Our model provides a bridge between the structural causal model view of causal inference common in the graphical models literature and the latent factor model view common in the potential outcomes literature. We show how classic models for potential outcomes and treatment assignments fit within our framework. We provide an identification argument for the average treatment effect, the average treatment effect on the treated, and the average treatment effect on the untreated. For any estimator that has a fast enough estimation error rate for a certain nuisance parameter, we establish that it is consistent for these causal parameters. We then show that principal component regression is one such estimator, and we analyze the minimal smoothness required of the potential outcomes function for consistency.
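The principal component regression step mentioned above can be sketched generically as follows. This is a minimal illustration of textbook PCR on synthetic data, not the authors' estimator: the function name `pcr_fit`, the rank `k`, and the data-generating choices are all assumptions made for the example.

```python
import numpy as np


def pcr_fit(X, y, k):
    """Generic principal component regression (illustrative sketch).

    Regresses y on the top-k principal directions of X and returns the
    coefficient vector expressed in the original feature space.
    """
    # Thin SVD: X = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uk, sk, Vk = U[:, :k], s[:k], Vt[:k].T
    # Least squares restricted to the top-k singular subspace,
    # i.e. beta = pinv(X_k) @ y where X_k is the rank-k truncation of X.
    return Vk @ ((Uk.T @ y) / sk)


# Synthetic, exactly rank-3 design matrix (hypothetical sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 20))
w = rng.normal(size=20)
y = X @ w  # noiseless outcome lying in the column space of X

beta = pcr_fit(X, y, k=3)
fitted = X @ beta
```

In this noiseless, exactly low-rank setting, choosing `k` equal to the true rank reproduces `y` up to floating-point error. With noisy data, `k` acts as a regularization parameter that has to be chosen, for example by inspecting the singular value spectrum.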
