Regression Discontinuity Design with Distribution-Valued Outcomes

David Van Dijcke

TL;DR

This paper develops a Regression Discontinuity Design for distribution-valued outcomes (R3D) and defines Local Average Quantile Treatment Effects (LAQTE) to capture shifts in entire outcome distributions around a treatment cutoff. It introduces two estimators (local polynomial regression on random quantiles and local Fréchet regression in 2-Wasserstein space), along with uniform, debiased confidence bands and data-driven bandwidth selection, with theoretical guarantees and a multiplier bootstrap for inference. Simulations show that the R3D estimators are less biased than quantile RD in this setting and provide valid uniform bands. The empirical application, which uses a close-election RD to study gubernatorial party control, reveals an equality–efficiency trade-off: Democratic governorships tend to reduce high-end income and compress the distribution, while effects at the bottom are milder and often not significant. Overall, R3D extends causal inference to distributional outcomes, supporting policy analysis at the level of the whole distribution.

Abstract

This article introduces Regression Discontinuity Design (RDD) with Distribution-Valued Outcomes (R3D), extending the standard RDD framework to settings where the outcome is a distribution rather than a scalar. Such settings arise when treatment is assigned at a higher level of aggregation than the outcome, for example, when a subsidy is allocated based on a firm-level revenue cutoff while the outcome of interest is the distribution of employee wages within the firm. Since standard RDD methods cannot accommodate such two-level randomness, I propose a novel approach based on random distributions. The target estimand is a "local average quantile treatment effect", which averages across random quantiles. To estimate this target, I introduce two related approaches: one that extends local polynomial regression to random quantiles and another based on local Fréchet regression, a form of functional regression. For both estimators, I establish asymptotic normality and develop uniform, debiased confidence bands together with a data-driven bandwidth selection procedure. Simulations validate these theoretical properties and show existing methods to be biased and inconsistent in this setting. I then apply the proposed methods to study the effects of gubernatorial party control on within-state income distributions in the US, using a close-election design. The results suggest a classic equality-efficiency tradeoff under Democratic governorship, driven by reductions in income at the top of the distribution.
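To make the first estimator concrete, the following is a minimal sketch of local linear regression applied to unit-level empirical quantiles, with the jump at the cutoff taken quantile by quantile. It is an illustration under stated assumptions, not the paper's implementation: the function names, the triangular kernel, the fixed bandwidth `h`, the equal within-unit sample sizes, and the simulated data are all hypothetical choices, and the sketch omits the debiasing, bandwidth selection, and uniform confidence bands described above.

```python
import numpy as np


def local_linear_intercept(x, y, h):
    """Weighted least squares of y on (1, x) with a triangular kernel.

    x is centered at the cutoff; y has one column per quantile level.
    Returns the fitted intercepts, i.e. the boundary values at x = 0.
    """
    w = np.clip(1.0 - np.abs(x) / h, 0.0, None)  # triangular kernel weights
    keep = w > 0
    sw = np.sqrt(w[keep])[:, None]
    design = np.column_stack([np.ones(keep.sum()), x[keep]])
    coef, *_ = np.linalg.lstsq(design * sw, y[keep] * sw, rcond=None)
    return coef[0]


def laqte_local_linear(x, samples, q_grid, cutoff=0.0, h=0.3):
    """Sketch of a quantile-by-quantile RD estimate.

    x:       (n_units,) running variable, one value per unit
    samples: (n_units, n_obs) within-unit outcome draws (equal sizes assumed)
    """
    # Step 1: empirical quantile function of each unit, shape (n_units, n_q).
    quantiles = np.quantile(samples, q_grid, axis=1).T
    # Step 2: local linear fit on each side of the cutoff, then difference.
    xc = x - cutoff
    right = xc >= 0
    upper = local_linear_intercept(xc[right], quantiles[right], h)
    lower = local_linear_intercept(xc[~right], quantiles[~right], h)
    return upper - lower


# Hypothetical simulated data: treatment shifts the location up by 2 and
# halves the within-unit spread, so the distribution is compressed.
rng = np.random.default_rng(0)
n_units, n_obs = 2000, 200
x = rng.uniform(-1.0, 1.0, n_units)
treated = x >= 0
loc = 1.0 * x + 2.0 * treated + rng.normal(0.0, 0.2, n_units)
scale = np.where(treated, 0.5, 1.0)
samples = loc[:, None] + scale[:, None] * rng.normal(size=(n_units, n_obs))

q_grid = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
tau_hat = laqte_local_linear(x, samples, q_grid)
print(dict(zip(q_grid.tolist(), np.round(tau_hat, 2).tolist())))
```

In this simulated design the effect at quantile level q is 2 - 0.5 z_q, where z_q is the standard normal quantile, so the estimates should be larger at the bottom of the distribution than at the top. Because each unit contributes a whole quantile function rather than a single scalar, the regression is run across units, which is the two-level structure the abstract refers to.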

Paper Structure

This paper contains 62 sections, 14 theorems, 135 equations, 14 figures, 2 tables.

Key Result

Lemma 1

Under Assumptions asspt:i_cont and asspt:i_dens, the unobserved $\tau^{\mathrm{R3D}}$ is identified from the joint distribution of the observed $(X,Y)$ as,

Figures (14)

  • Figure 1: Example of a Distribution-Valued RDD
  • Figure 2: Local Polynomial Estimator: Illustration
  • Figure 3: Simulated Bias of R3D and Q-RD Estimators
  • Figure 4: Simulated Variance of R3D Estimators
  • Figure 5: R3D Plot: Average Income Quantiles vs. Democrat Vote Share
  • ...and 9 more figures

Theorems & Definitions (35)

  • Example 1: Administrative units
  • Example 2: Institutions
  • Example 3: Establishments
  • Definition 1: Local Average Quantile Treatment Effects (LAQTE)
  • Example 4
  • proof
  • Lemma 1: Identification
  • Definition 2: Fuzzy LAQTE
  • Lemma 2: Fuzzy Identification
  • Theorem 1: Convergence: Conditional Means
  • ...and 25 more