The Identification Power of Combining Experimental and Observational Data for Distributional Treatment Effect Parameters
Shosei Sakaguchi
TL;DR
The paper develops a copula-based framework for identifying distributional treatment effect (DTE) parameters by combining experimental (randomized) and observational (self-selected) data. It shows that self-selection in the observational data generally tightens the identified set for DTE parameters, with no gain when selection-on-observables holds, and it derives nonparametric sharp bounds for broad classes of DTE parameters (supermodular, $\varphi$-indicator) both with and without self-selection. A linear programming approach computes sharp bounds under additional structural restrictions, such as mutual LTD dependence between potential outcomes and Roy-style selection, which makes the bounds tractable in practice. The empirical DRPT analysis of negative political advertising shows that the identified set tightens substantially once observational data are incorporated, illustrating the practical value of the design. Overall, the paper provides an identification toolkit for DTE parameters that uses both data sources, states the conditions under which gains arise, and remains computable via linear programs.
Abstract
This study investigates the identification power gained by combining experimental data, in which treatment is randomized, with observational data, in which treatment is self-selected, for distributional treatment effect (DTE) parameters. While experimental data identify average treatment effects, many DTE parameters, such as the distribution of individual treatment effects, are only partially identified. We examine whether and how combining these two data sources tightens the identified set for such parameters. For broad classes of DTE parameters, we derive nonparametric sharp bounds under the combined data and clarify the mechanism through which data combination improves identification relative to using experimental data alone. Our analysis highlights that self-selection in observational data is a key source of identification power. We establish necessary and sufficient conditions under which the combined data shrink the identified set, showing that such shrinkage generally occurs unless selection-on-observables holds in the observational data. We also propose a linear programming approach to compute sharp bounds that can incorporate additional structural restrictions, such as positive dependence between potential outcomes and the generalized Roy selection model. An empirical application using data on negative campaign advertisements in the 2008 U.S. presidential election illustrates the practical relevance of the proposed approach.
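The linear programming idea can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes discrete outcomes on a common support indexed 0, ..., K-1, the same population in both data sources, and no covariates or structural restrictions. The function name `dte_bounds`, the inputs `p0`, `p1`, `q0`, `q1`, and all numbers are hypothetical. The unknown is the joint distribution of (Y0, Y1, D), where D is the self-selected treatment; the experimental data fix the marginals of Y0 and Y1, and the observational data fix P(Y0 = y, D = 0) and P(Y1 = y, D = 1). The target here is P(Y1 > Y0), one simple DTE parameter.

```python
import numpy as np
from scipy.optimize import linprog


def dte_bounds(p0, p1, q0=None, q1=None):
    """Bounds on P(Y1 > Y0) over joint pmfs pi[i, j, d] of (Y0, Y1, D).

    p0[i] = P(Y0 = i), p1[j] = P(Y1 = j)        (experimental data)
    q0[i] = P(Y = i, D = 0) = P(Y0 = i, D = 0)  (observational data)
    q1[j] = P(Y = j, D = 1) = P(Y1 = j, D = 1)  (observational data)
    If q0 and q1 are omitted, only the experimental constraints are used.
    """
    K = len(p0)
    shape = (K, K, 2)
    rows, rhs = [], []

    def add_constraint(mask, value):
        # Each constraint says: the cells flagged by `mask` sum to `value`.
        rows.append(mask.ravel())
        rhs.append(value)

    for i in range(K):  # experimental marginal of Y0
        m = np.zeros(shape); m[i, :, :] = 1.0; add_constraint(m, p0[i])
    for j in range(K):  # experimental marginal of Y1
        m = np.zeros(shape); m[:, j, :] = 1.0; add_constraint(m, p1[j])
    if q0 is not None and q1 is not None:
        for i in range(K):  # untreated units reveal Y0 in observational data
            m = np.zeros(shape); m[i, :, 0] = 1.0; add_constraint(m, q0[i])
        for j in range(K):  # treated units reveal Y1 in observational data
            m = np.zeros(shape); m[:, j, 1] = 1.0; add_constraint(m, q1[j])

    # Objective: total mass on cells with Y1 > Y0.
    c = np.zeros(shape)
    for i in range(K):
        c[i, i + 1:, :] = 1.0
    c = c.ravel()

    A_eq, b_eq = np.array(rows), np.array(rhs)
    lower = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    upper = linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return lower.fun, -upper.fun


# Hypothetical binary-outcome example (numbers chosen for illustration only).
p0, p1 = [0.5, 0.5], [0.4, 0.6]      # experimental marginals
q0, q1 = [0.35, 0.15], [0.1, 0.4]    # observational joint with D, P(D = 1) = 0.5

exp_only = dte_bounds(p0, p1)
combined = dte_bounds(p0, p1, q0, q1)
print("experimental only:", np.round(exp_only, 3))
print("combined data:    ", np.round(combined, 3))
```

In this toy example the experimental-only bounds on P(Y1 > Y0) are [0.1, 0.5], and adding the observational constraints tightens them to [0.1, 0.35]. If the hypothetical `q0` and `q1` were instead proportional to `p0` and `p1` (selection unrelated to potential outcomes), the two sets of bounds would coincide, in line with the abstract's statement that gains come from self-selection. Additional structural restrictions would enter as further linear constraints on `pi`.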
