Compositional difference-in-differences

Onil Boussim

TL;DR

CoDiD addresses policy evaluations where outcomes are compositional vectors by jointly identifying effects on both totals and category shares. It hinges on a parallel growth assumption for log-quantities, which under a random utility model corresponds to parallel trends in expected utilities and yields parallel trajectories in the probability simplex; this enables point identification of counterfactual distributions and the computation of GTT, ATT, and CTT. The framework is extended to multiple pre-treatment periods with robust bounds when parallelism is questionable. An empirical application on early voting shows turnout gains and shifts in party support across the full composition, illustrating the method's practical relevance for policy analysis of multidimensional outcomes.

Abstract

Many policy evaluations involve vectors of category-specific quantities, either categorical outcomes (e.g., employment type, major choice) or compositional measures (e.g., GDP by sector, votes by party, electricity generation by source). In these settings, both intensive margins (shares) and extensive margins (totals) can matter. However, existing Difference-in-Differences (DiD) strategies typically focus only on the shares and do not jointly identify treatment effects on totals. In addition, these approaches usually lack a clear economic interpretation. I develop Compositional Difference-in-Differences (CoDiD), a new framework that identifies treatment effects on both shares and totals in a coherent way. The key assumption is parallel growth: in the absence of treatment, the log-quantities of each category would have evolved in parallel for the treated and control groups. I show that, under a random-utility discrete-choice model, this condition is equivalent to parallel trends in expected utilities, meaning that the change in average latent attractiveness for each alternative is identical across groups. Furthermore, geometrically, the counterfactual distributions (shares) follow parallel trajectories in the probability simplex. In settings with multiple time periods, I introduce a relaxation that delivers bounds when parallel growth may not hold. I illustrate the empirical relevance of the method by examining how early voting reforms affected the 2008 U.S. election.
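To make the parallel growth assumption concrete, here is a minimal numerical sketch. It assumes a two-group, two-period setting, and the vote counts are hypothetical numbers chosen for illustration, not data from the paper. The function names `counterfactual_quantities` and `shares` are also illustrative. The effects computed at the end are plain differences between observed and counterfactual values; they are not necessarily the paper's definitions of its estimands.

```python
import numpy as np


def counterfactual_quantities(q_treated_pre, q_control_pre, q_control_post):
    """Untreated post-period quantities for the treated group.

    Parallel growth in log-quantities, category by category:
        log q0_treated_post - log q_treated_pre
            = log q_control_post - log q_control_pre
    so each treated pre-period quantity is scaled by the control growth factor.
    """
    q_treated_pre = np.asarray(q_treated_pre, dtype=float)
    growth = np.asarray(q_control_post, dtype=float) / np.asarray(q_control_pre, dtype=float)
    return q_treated_pre * growth


def shares(q):
    """Normalize a vector of quantities to shares that sum to one."""
    q = np.asarray(q, dtype=float)
    return q / q.sum()


# Hypothetical counts for three categories (illustrative only).
q_control_pre = [100.0, 200.0, 50.0]
q_control_post = [110.0, 180.0, 60.0]
q_treated_pre = [80.0, 120.0, 40.0]
q_treated_post = [100.0, 115.0, 50.0]  # observed under treatment

q_cf = counterfactual_quantities(q_treated_pre, q_control_pre, q_control_post)

# Effect on the total (extensive margin) and on the shares (intensive margin).
total_effect = np.sum(q_treated_post) - q_cf.sum()
share_effect = shares(q_treated_post) - shares(q_cf)

print("counterfactual quantities:", q_cf)
print("effect on total:", total_effect)
print("effect on shares:", share_effect)
```

Because observed and counterfactual shares each sum to one, the share effects sum to zero across categories: a gain in one category's share is offset elsewhere in the composition.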

Paper Structure

This paper contains 23 sections, 7 theorems, 81 equations, 5 figures, 5 tables.

Key Result

Theorem 1

Under assumption ass1 and the parallel growth assumption (apg), the counterfactual quantities are identified as:
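The display that follows this statement is not reproduced here. As a sketch only, the form implied by parallel growth as described in the abstract would be the following. The notation $q^{0}_{1,1}$ and $\pi^{0}_{1,1}$ follows the figure captions; the category index $k$, the total $Q^{0}_{1,1}$, and the use of observed pre-period quantities for the treated group are assumptions introduced for illustration, and the paper's exact statement may differ.

```latex
% Sketch under stated assumptions: k indexes categories; q_{g,t,k} is the
% observed quantity for group g (1 = treated) in period t (1 = post).
\[
\log q^{0}_{1,1,k} - \log q_{1,0,k} = \log q_{0,1,k} - \log q_{0,0,k}
\quad\Longrightarrow\quad
q^{0}_{1,1,k} = q_{1,0,k}\,\frac{q_{0,1,k}}{q_{0,0,k}},
\]
\[
Q^{0}_{1,1} = \sum_{k} q^{0}_{1,1,k},
\qquad
\pi^{0}_{1,1,k} = \frac{q^{0}_{1,1,k}}{Q^{0}_{1,1}}.
\]
```

The first line exponentiates the parallel growth condition category by category; the second recovers the counterfactual total and shares from the counterfactual quantities.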

Figures (5)

  • Figure 1: Visual representation of the DiD identification strategy. The counterfactual outcome for the treated group in the post-treatment period ($q^{0}_{1,1}$) is constructed by extrapolating the time trend observed in the control group (blue arrows) to the treated group’s pre-treatment level. The causal effect is the difference between the observed treated outcome ($q^{1}_{1,1}$) and this counterfactual ($q^{0}_{1,1}$), shown by the orange double-headed arrow.
  • Figure 2: Illustration of parallel growths in the simplex. The red curve represents the trajectory of the control group from pre-treatment ($\pi_{0,0}^{0}$) to post-treatment ($\pi_{0,1}^{0}$), while the blue curve shows the counterfactual trajectory of the treated group from pre-treatment ($\pi_{1,0}^{0}$) to post-treatment ($\pi_{1,1}^{0}$). Dashed lines indicate the linear translation (parallel growths).
  • Figure 3: Evolution of log-ratios, 1992-2008 (treated vs. control)
  • Figure 4: Evolution of shares, 1992-2008 (treated vs. control)
  • Figure 5: Evolution of quantities, 1992-2008 (treated vs. control)

Theorems & Definitions (7)

  • Theorem 1
  • Proposition 1: Implication for shares
  • Proposition 2: Implication for expected utilities
  • Proposition 3: Implication for shares
  • Proposition 4
  • Proposition 5: Aitchison2002, egozcue2003isometric
  • Theorem 2: Partial Identification