On Causal Inference with Model-Based Outcomes

Dmitry Arkhangelsky, Kazuharu Yanagimoto, Tom Zohar

TL;DR

The paper addresses the challenge of estimating the causal effects of group-level policies on model-based parameters derived from microdata. It identifies a novel endogenous weighting bias in standard one-step GMM/OLS approaches: policy-induced changes in the microdata distribution alter the estimator's implicit weights, making the estimator inconsistent. To address this, it advocates a two-step Minimum Distance (MD) framework that first estimates group-specific parameters and then relates them to the policy, separating parameter identification from policy evaluation; it also develops a robust variant that uses auxiliary information when within-group data are scarce. In the empirical application to the 2005 Dutch childcare reform, the one-step estimates are substantially larger than the MD estimates, underscoring the importance of explicit weighting and the practical benefits of the two-step approach for credible policy evaluation.

Abstract

We study the estimation of causal effects on group-level parameters identified from microdata (e.g., child penalties). We demonstrate that standard one-step methods (such as pooled OLS and IV regressions) are generally inconsistent due to an endogenous weighting bias, where the policy affects the implicit weights (e.g., altering fertility rates). In contrast, we advocate for a two-step Minimum Distance (MD) framework that explicitly separates parameter identification from policy evaluation. This approach eliminates the endogenous weighting bias and requires explicitly confronting sample selection when groups are small, thereby improving transparency. We show that the MD estimator performs well when parameters can be estimated for most groups, and propose a robust alternative that uses auxiliary information in settings with limited data. To illustrate the importance of this methodological choice, we evaluate the effect of the 2005 Dutch childcare reform on child penalties and find that the conventional one-step approach yields estimates that are substantially larger than those from the two-step method.
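
The contrast between the one-step and two-step approaches can be illustrated with a small simulation. This is a toy sketch under assumed conditions, not the authors' estimator or data: the design, the function names (`simulate`, `one_step`, `two_step`), and all parameter values are hypothetical. Each group has a parameter $\theta_g = -1 + b_g W_g$ with heterogeneous effects $b_g$ averaging 1, and the policy $W_g$ also raises the share of individuals with $D_i = 1$ by more in groups where $b_g$ is large. The second step here uses equal group weights, which is only one simple instance of a two-step procedure.

```python
import numpy as np

def simulate(G=2000, n=200, seed=0):
    """Hypothetical design: the policy W shifts both theta_g and the share with D=1."""
    rng = np.random.default_rng(seed)
    W = rng.integers(0, 2, size=G)           # group-level policy indicator
    b = rng.uniform(0.0, 2.0, size=G)        # heterogeneous policy effect, mean 1
    theta = -1.0 + b * W                     # group-level parameter of interest
    p = 0.2 + 0.15 * b * W                   # policy raises P(D=1) more when b is large
    D = (rng.random((G, n)) < p[:, None]).astype(float)
    Y = theta[:, None] * D + rng.normal(size=(G, n))
    return W, D, Y

def one_step(W, D, Y):
    """Pooled OLS of Y on (1, D, W, D*W); returns the D*W coefficient."""
    d, y = D.ravel(), Y.ravel()
    w = np.repeat(W, D.shape[1]).astype(float)   # expand W to the individual level
    X = np.column_stack([np.ones_like(d), d, w, d * w])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[3]

def two_step(W, D, Y):
    """Step 1: estimate theta_g within each group. Step 2: compare across W."""
    n1 = D.sum(axis=1)                       # assumes every group has D=1 and D=0 units
    n0 = D.shape[1] - n1
    theta_hat = (Y * D).sum(axis=1) / n1 - (Y * (1 - D)).sum(axis=1) / n0
    return theta_hat[W == 1].mean() - theta_hat[W == 0].mean()

if __name__ == "__main__":
    W, D, Y = simulate()
    print("one-step (pooled OLS):", one_step(W, D, Y))
    print("two-step (equal group weights):", two_step(W, D, Y))
```

In this design the average effect is 1 by construction. The pooled coefficient implicitly weights treated groups by their number of $D_i = 1$ observations, so in large samples it approaches $E[b_g p_g]/E[p_g] = 0.4/0.35 \approx 1.14$ rather than 1, while the equal-weight two-step estimate stays close to 1. The direction and size of the gap depend entirely on the assumed design.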

Paper Structure (57 sections, 7 theorems, 113 equations, 4 figures, 6 tables)


Key Result

Proposition 1

Suppose Assumption as:simple_model holds. Then the probability limit $B_{lim} = \text{plim}_{G\to\infty} \hat{B}^{GMM}$ satisfies $B_{lim} = B_0$ if and only if a covariance condition holds (the displayed equation is missing here), where $M_{\Gamma} = I_k - \Gamma(\Gamma^\top\Gamma)^{-1}\Gamma^\top$ projects orthogonal to the columns of $\Gamma$, and the covariance is taken over the population distribution of groups.
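
As a quick numerical check of the projection used in this statement, the following sketch builds $M_{\Gamma}$ for an arbitrary full-column-rank matrix and confirms the standard properties of an orthogonal projector. The helper name `annihilator` and the example dimensions are hypothetical and are not taken from the paper.

```python
import numpy as np

def annihilator(Gamma):
    """Return M = I_k - Gamma (Gamma' Gamma)^{-1} Gamma' for a full-column-rank Gamma."""
    k = Gamma.shape[0]
    # Solve (Gamma' Gamma) X = Gamma' instead of forming an explicit inverse.
    proj = Gamma @ np.linalg.solve(Gamma.T @ Gamma, Gamma.T)
    return np.eye(k) - proj

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Gamma = rng.normal(size=(5, 2))          # hypothetical k = 5, two columns
    M = annihilator(Gamma)
    print(np.allclose(M @ Gamma, 0.0))       # M removes the column space of Gamma
    print(np.allclose(M @ M, M))             # M is idempotent
```

Because $M_{\Gamma}\Gamma = 0$, any component of a vector that lies in the column space of $\Gamma$ is removed, and only the orthogonal remainder survives the projection.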

Figures (4)

  • Figure I: Childcare supply expansion
  • Figure II: GMM vs MD: Effect of the childcare provision expansion on CP
  • Figure I: Distribution of number of people with $x$ at $g$: $n_g(x)$
  • Figure II: Child penalties by age at first childbirth

Theorems & Definitions (20)

  • Remark 2.1: The role of heterogeneity
  • Remark 2.2: Choice between GMM and MD
  • Remark 2.3: Weighted oracle
  • Proposition 1
  • Corollary 1: Consistency of MD with Fixed $n_g$
  • Proposition 2: Bias Bound
  • Proposition 3: Asymptotic Normality
  • Remark 4.1: No Unbiased Estimation
  • Remark 5.1: On Normalizing Child Penalties
  • Lemma 1: Identification with Scalar Effects
  • ...and 10 more