Causal Inference for Experiments with Latent Outcomes: Key Results and Their Implications for Design and Analysis

Jiawei Fu, Donald P. Green

TL;DR

The paper addresses the estimation of causal effects when the outcome is latent and measured with error through multiple proxies. It introduces a design-based framework in which the latent outcome is linked to observed measures via scaling parameters identified with instrumental variables, and it addresses study-specific noncomparability by fixing a reference scale. It develops two estimation paths, the optimally weighted scaled index (WSI) and structural equation modeling (SEM), and shows that, with proper scaling, the ALTE can be estimated efficiently and robustly and compared across studies. The analysis provides practical guidance on when to collect more outcome measures versus more subjects, and emphasizes robustness checks, nonparametric extensions, and measurement-equivalence tests. An empirical application shows substantial gains in precision and robustness from using multiple outcome measures, supporting broader adoption of design-based latent-outcome inference in experimental settings.

Abstract

How should researchers analyze randomized experiments in which the main outcome is latent and measured in multiple ways but each measure contains some degree of error? We first identify a critical study-specific noncomparability problem in existing methods for handling multiple measurements, which often rely on strong modeling assumptions or arbitrary standardization. Such approaches render the resulting estimands noncomparable across studies. To address the problem, we describe design-based approaches that enable researchers to identify causal parameters of interest, suggest ways that experimental designs can be augmented so as to make assumptions more credible, and discuss empirical tests of key assumptions. We show that when experimental researchers invest appropriately in multiple outcome measures, an optimally weighted scaled index of these measures enables researchers to obtain efficient and interpretable estimates of causal parameters by applying standard regression. An empirical application illustrates the gains in precision and robustness that multiple outcome measures can provide.

Paper Structure

This paper contains 51 sections, 5 theorems, 23 equations, 20 figures, 7 tables.

Key Result

Proposition 1

Suppose Assumption ass:frame holds. Then in the above causal framework given $\lambda_1=1$, $\lambda_j$ is identified either by (1) $\lambda_j=\frac{Cov(Z_i,Y_{ij})}{Cov(Z_i,Y_{i1})}$ if $\mathbb{E}[\eta_i^1-\eta_i^0]\neq 0$ or by (2) $\lambda_j=\frac{Cov(Y_{ik},Y_{ij})}{Cov(Y_{ik},Y_{i1})}\; \foral
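The two covariance ratios in Proposition 1 can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a linear measurement model $Y_{ij} = \lambda_j \eta_i + \epsilon_{ij}$ with $\lambda_1 = 1$, a constant treatment effect on $\eta$, and mutually independent measurement errors. For ratio (2) it also assumes $k$ differs from both $1$ and $j$; the statement above is truncated, so the paper's exact condition may differ. The function names (`simulate`, `lambda_iv`, `lambda_proxy`) and all parameter values are hypothetical.

```python
import numpy as np


def simulate(n=200_000, tau=0.5, lambdas=(1.0, 0.7, 1.8), seed=0):
    """Simulate a randomized experiment with one latent outcome and three noisy measures.

    Assumed (illustrative) model: eta_i = eta_i^0 + tau * Z_i and
    Y_ij = lambda_j * eta_i + eps_ij, with independent standard-normal errors.
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)            # random assignment Z_i in {0, 1}
    eta = rng.normal(size=n) + tau * z        # latent outcome, never observed directly
    lam = np.asarray(lambdas)                 # column 0 is the reference measure (lambda_1 = 1)
    y = eta[:, None] * lam + rng.normal(size=(n, lam.size))
    return z, y


def cov(a, b):
    """Sample covariance between two 1-D arrays."""
    return np.cov(a, b)[0, 1]


def lambda_iv(z, y, j):
    """Ratio (1): Cov(Z, Y_j) / Cov(Z, Y_1); needs a nonzero average effect on eta."""
    return cov(z, y[:, j]) / cov(z, y[:, 0])


def lambda_proxy(y, j, k):
    """Ratio (2): Cov(Y_k, Y_j) / Cov(Y_k, Y_1), using a third measure k as the instrument."""
    return cov(y[:, k], y[:, j]) / cov(y[:, k], y[:, 0])


if __name__ == "__main__":
    z, y = simulate()
    # Columns are 0-indexed: column 1 is Y_2 (true lambda 0.7), column 2 is Y_3 (true lambda 1.8).
    print("lambda_2 via Z:  ", lambda_iv(z, y, 1))
    print("lambda_2 via Y_3:", lambda_proxy(y, 1, 2))
    print("lambda_3 via Z:  ", lambda_iv(z, y, 2))
    print("lambda_3 via Y_2:", lambda_proxy(y, 2, 1))
```

In this setup both ratios should land near the true scaling parameters. The ratio based on $Z_i$ is typically noisier because its denominator depends on the size of the treatment effect, and it is undefined when the average effect on the latent outcome is zero.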

Figures (20)

  • Figure 1: Graphical Depiction of an Experimental Design in which a Latent Outcome is Measured Linearly by Three Outcomes, Each Measured with Error.
  • Figure 2: Power analysis: WSI, Equal weighting, SEM, ICW, and PCA
  • Figure 3: Simulation Illustrating Variance Reduction Due to Collection of Additional Outcome Measures. The horizontal axis shows the number of measures and the vertical axis shows the estimated variance of the optimal weighting estimator. The variance reduction is larger when reliability is lower. A detailed explanation may be found in the SI.
  • Figure 4: Binary measurements and Nonparametric Estimation
  • Figure 5: Assessing linearity between $\eta$ and an additive index created by adding binary items together. We create an additive index $v_1$ by summing up 5, 10, 15, and 20 binary responses from IRT models. Data points have been jittered slightly for clarity. The rug plots on both axes denote the distribution of data. The vertical axis shows the additive index created by adding all binary variables in the simulation. The horizontal axis is the true latent variable ($\eta$) used in the IRT model.
  • ...and 15 more figures

Theorems & Definitions (13)

  • Proposition 1: Identification of the measurement scaling parameters
  • proof
  • Proposition 2
  • proof
  • Example 1: Binary measurements
  • Lemma A.1: Linearity
  • proof
  • proof
  • proof
  • Proposition A.1
  • ...and 3 more