Synthetic Interventions

Anish Agarwal, Devavrat Shah, Dennis Shen

TL;DR

The paper develops Synthetic Interventions (SI), a framework that extends synthetic controls to multiple treatments by using a low-rank tensor factor model that captures unit, time, and treatment latent structure. It recasts counterfactual estimation as tensor completion, provides an identification result and a generalized SI estimator (SI-PCR), and proves consistency, with asymptotic normality under additional assumptions. Through simulations and a replication of the Proposition 99 study, the authors show how SI can reveal relationships between anti-tobacco programs and tax increases, and demonstrate practical performance under covariate-shift-like conditions. The approach enables causal inference across multiple treatments within panel data, preserving interpretability via weighted donor combinations and offering a pathway to inference with theoretically grounded error bounds. Overall, SI broadens the policy-evaluation toolkit by accommodating multiple treatments and leveraging tensor-structured latent factors for robust counterfactual prediction.

Abstract

The synthetic controls (SC) methodology is a prominent tool for policy evaluation in panel data applications. Researchers commonly justify the SC framework with a low-rank matrix factor model that assumes the potential outcomes are described by low-dimensional unit and time specific latent factors. In the recent work of [Abadie '20], one of the pioneering authors of the SC method posed the question of how the SC framework can be extended to multiple treatments. This article offers one resolution to this open question that we call synthetic interventions (SI). Fundamental to the SI framework is a low-rank tensor factor model, which extends the matrix factor model by including a latent factorization over treatments. Under this model, we propose a generalization of the standard SC-based estimators. We prove the consistency for one instantiation of our approach and provide conditions under which it is asymptotically normal. Moreover, we conduct a representative simulation to study its prediction performance and revisit the canonical SC case study of [Abadie-Diamond-Hainmueller '10] on the impact of anti-tobacco legislations by exploring related questions not previously investigated.
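
To make the estimation idea concrete, here is a minimal sketch of a principal component regression (PCR) variant of SI on noiseless toy data. It is not the authors' code: the function name `si_pcr`, its argument names, and the toy data generator are assumptions made for illustration. The sketch learns donor weights from pre-intervention outcomes and reuses them on the donors' post-intervention outcomes under treatment `d`.

```python
import numpy as np

def si_pcr(y_pre_target, Y_pre_donors, Y_post_donors, k):
    """Hypothetical helper: PCR-based synthetic-interventions estimate.

    y_pre_target  : (T0,)    target unit, pre-intervention outcomes
    Y_pre_donors  : (T0, Nd) donor units, pre-intervention outcomes
    Y_post_donors : (T1, Nd) donor units, post-intervention outcomes under d
    k             : number of principal components to keep
    """
    U, s, Vt = np.linalg.svd(Y_pre_donors, full_matrices=False)
    # Regress the target on the top-k principal components of the donors.
    w = Vt[:k].T @ ((U[:, :k].T @ y_pre_target) / s[:k])
    # Reuse the learned weights on donor outcomes observed under treatment d.
    return Y_post_donors @ w, w

# Toy low-rank tensor factor model (an assumption for this demo):
# Y*[t, n, d] = sum_l time_f[t, l] * unit_f[n, l] * treat_f[d, l], no noise.
rng = np.random.default_rng(0)
T0, T1, N, D, r = 30, 10, 25, 2, 3
time_f = rng.standard_normal((T0 + T1, r))
unit_f = rng.standard_normal((N, r))
treat_f = rng.uniform(0.5, 1.5, size=(D, r))
Y_star = np.einsum("tl,nl,dl->tnd", time_f, unit_f, treat_f)

# Unit 0 is the target; units 1..N-1 act as donors that receive treatment d=1
# after time T0. Everyone is under control (d=0) before T0.
d = 1
y_pre_target = Y_star[:T0, 0, 0]
Y_pre_donors = Y_star[:T0, 1:, 0]
Y_post_donors = Y_star[T0:, 1:, d]

pred, w = si_pcr(y_pre_target, Y_pre_donors, Y_post_donors, k=r)
truth = Y_star[T0:, 0, d]  # counterfactual that is never observed in practice
print("max abs error:", np.max(np.abs(pred - truth)))
print("post-period average estimate:", pred.mean())
```

In this noiseless setting the donor weights recover the target's counterfactual trajectory up to floating-point error, because the target's latent unit factor lies in the span of the donors' factors. With noisy observations, the truncation level `k` becomes a tuning choice and the recovery is only approximate.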

Paper Structure

This paper contains 61 sections, 23 theorems, 90 equations, 4 figures, 2 tables.

Key Result

Theorem 1

Given $(i,d)$, let Assumptions \ref{assumption:sutva} to \ref{assumption:linear} hold. Then we have

Figures (4)

  • Figure 1: Figure \ref{fig:tensor.ideal} visualizes the potential outcomes tensor, $\boldsymbol{Y}^*$, while Figure \ref{fig:tensor.obs} visualizes the observed tensor, $\boldsymbol{Y}$, as per Assumption \ref{assumption:sutva} and tailored to our tobacco study. The colored blocks indicate observed entries with the color indexing the treatment: status quo in gray, anti-tobacco programs in blue, and raised taxes in orange. The white blocks indicate missing entries.
  • Figure 2: Simulation displays the spectrum of $\boldsymbol{Y} = \mathbb{E}[\boldsymbol{Y}] + \boldsymbol{E} \in \mathbb{R}^{100 \times 100}$. Here, $\mathbb{E}[\boldsymbol{Y}] = \boldsymbol{U} \boldsymbol{V}^\top$, where the entries of $\boldsymbol{U}, \boldsymbol{V} \in \mathbb{R}^{100 \times 10}$ are sampled independently from $\mathcal{N}(0,1)$; the entries of $\boldsymbol{E}$ are sampled independently from $\mathcal{N}(0, \sigma^2)$ with $\sigma^2 \in \{0, 0.2, \dots, 0.8\}$. Across varying levels of $\sigma^2$, there is a steep drop-off in the magnitude of the singular values; this marks the "elbow" point. The top singular values of $\boldsymbol{Y}$ correspond closely with those of $\mathbb{E}[\boldsymbol{Y}]$ ($\sigma^2=0$), and the remaining singular values are induced by $\boldsymbol{E}$. Thus, $\operatorname{rank}(\boldsymbol{Y}) \approx \operatorname{rank}(\mathbb{E}[\boldsymbol{Y}]) = 10$.
  • Figure 3: Simulation results displaying the absolute point-wise prediction errors across $T_0 = N_d \in \{25, 50, 75, \dots, 200\}$. Solid lines show the mean over $50$ trials, with shading to show $\pm$ one standard error. Blue and orange lines show the errors when Assumption \ref{assumption:subspace} holds ($\text{✓}$) and fails ($\text{✗}$), respectively. In line with Theorem \ref{thm:consistency}, the $\text{✓}$ errors decay as $T_0$ and $N_d$ grow. By contrast, the $\text{✗}$ errors fail to converge, which provides empirical evidence for the importance of Assumption \ref{assumption:subspace}.
  • Figure 4: The solid and dash-dotted lines show the observed and estimated trajectories of cigarette sales, respectively. Treatments are indexed by color: status quo in black, anti-tobacco program in blue, and raised taxes in orange. The vertical dotted gray line represents the year 1988 when California passed Proposition 99; this splits the time horizon into the pre-treatment period (1970--1988) and post-treatment period (1989--2000). Figures \ref{fig:kansas}, \ref{fig:arizona}, and \ref{fig:jersey} show the results for an example state that in actuality kept the status quo (Kansas), imposed an anti-tobacco program (Arizona), and raised taxes (New Jersey), respectively.
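
The simulation described in the Figure 2 caption can be reproduced approximately with a few lines of NumPy. This is a sketch under stated assumptions: the helper name `simulate_spectrum`, the seed, and the reading of $\{0, 0.2, \dots, 0.8\}$ as a grid with step 0.2 are not taken from the paper. It prints the 10th and 11th singular values of $\boldsymbol{Y}$ so the drop-off at the elbow is visible.

```python
import numpy as np

def simulate_spectrum(noise_var, n=100, rank=10, seed=0):
    """Singular values of Y = U V^T + E for the setup in the Figure 2 caption."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n, rank))   # entries ~ N(0, 1)
    V = rng.standard_normal((n, rank))   # entries ~ N(0, 1)
    # Entries of E ~ N(0, noise_var); noise_var = 0 gives the noiseless mean.
    E = rng.normal(0.0, np.sqrt(noise_var), size=(n, n))
    return np.linalg.svd(U @ V.T + E, compute_uv=False)

# Assumed noise grid: step of 0.2 between 0 and 0.8.
for noise_var in (0.0, 0.2, 0.4, 0.6, 0.8):
    s = simulate_spectrum(noise_var)
    # s[9] is the 10th singular value (signal); s[10] is the 11th (noise).
    print(f"sigma^2={noise_var:.1f}  s_10={s[9]:.2f}  s_11={s[10]:.2e}")
```

For each noise level, the gap between the 10th and 11th singular values is the elbow the caption refers to. In the noiseless case the 11th singular value is zero up to floating-point error.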

Theorems & Definitions (47)

  • Theorem 1
  • Remark 1: On Assumption \ref{assumption:form}
  • Remark 2: On Assumption \ref{assumption:conditional_mean_zero}
  • Remark 3: On Assumption \ref{assumption:linear}
  • Theorem 2
  • Remark 4: Asymptotic normality
  • Remark 5: On Assumption \ref{assumption:noise}
  • Remark 6: On Assumption \ref{assumption:boundedness}
  • Remark 7: On Assumption \ref{assumption:spectra}
  • Remark 8: On Assumption \ref{assumption:subspace}
  • ...and 37 more