Semiparametric causal mediation analysis of cluster-randomized trials for indirect and spillover effects

Chao Cheng, Fan Li

TL;DR

This work develops a comprehensive semiparametric framework for causal mediation analysis in cluster-randomized trials with informative cluster size and within-cluster interference. It derives efficient influence functions for cluster- and individual-level mediation estimands ($NIE$, $NDE$, $SME$, and $IME$) and proposes doubly robust estimators that integrate parametric and machine-learning nuisance models with cross-fitting and stabilization. Identification is established under standard CRT assumptions, extended to capture cross-world and interference aspects, enabling estimation of all mediation functionals via $\theta_V(a,a^*)$ and $\tau_V$. Through simulations and an application to the Red de Protección Social trial, the approach demonstrates potential mediation through household dietary diversity while accommodating spillovers, though it notes cautions related to small samples and unmeasured confounding.

Abstract

In cluster-randomized trials (CRTs), there is emerging interest in exploring the causal mechanism in which a cluster-level treatment affects the outcome through an intermediate outcome. The majority of existing causal mediation methods are applicable to independent data and only a few exceptions have considered assessing causal mediation in CRTs, all of which heavily depend on parametric assumptions. In this article, we develop a formal semiparametric efficiency theory to motivate new doubly-robust methods for addressing different mediation effect estimands -- the natural indirect effect, individual mediation effect, and spillover mediation effect (the extent to which one's outcome is influenced by others' mediators). We derive the efficient influence function for each estimand, and carefully parameterize each efficient influence function to motivate practical estimators. We consider both parametric working models and data-adaptive machine learners to estimate the nuisance functions, and obtain the semiparametric efficient estimators in the latter case. We conduct simulation studies to demonstrate the finite-sample performance of our new estimators and illustrate our proposed methods by reanalyzing a real-world CRT.
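
The distinction between cluster-level and individual-level estimands under informative cluster size can be illustrated with a small numerical sketch. The numbers below are hypothetical and not taken from the paper. The function names are illustrative only. The decomposition assumes the conventional definitions $NIE = \theta(1,1) - \theta(1,0)$ and $NDE = \theta(1,0) - \theta(0,0)$, which may differ in detail from the paper's notation.

```python
def cluster_average(values_by_cluster):
    """Mean of per-cluster means: each cluster gets equal weight."""
    cluster_means = [sum(v) / len(v) for v in values_by_cluster]
    return sum(cluster_means) / len(cluster_means)


def individual_average(values_by_cluster):
    """Pooled mean: each individual gets equal weight."""
    pooled = [x for v in values_by_cluster for x in v]
    return sum(pooled) / len(pooled)


def decompose(theta):
    """Split the total effect into indirect and direct parts.

    `theta` maps (a, a_star) to the mean potential outcome under
    treatment a with mediators set to their values under a_star.
    Assumes the conventional definitions (an assumption, see lead-in).
    """
    nie = theta[(1, 1)] - theta[(1, 0)]
    nde = theta[(1, 0)] - theta[(0, 0)]
    return nie, nde


# Hypothetical individual-level contrasts: the larger cluster has the
# larger effect, so cluster size is informative.
contrasts = [[1.0, 1.0], [3.0, 3.0, 3.0, 3.0, 3.0, 3.0]]
print(cluster_average(contrasts))     # equal weight per cluster
print(individual_average(contrasts))  # equal weight per individual

# Hypothetical values of theta(a, a*).
theta = {(1, 1): 2.5, (1, 0): 2.0, (0, 0): 1.0}
nie, nde = decompose(theta)
print(nie, nde, nie + nde)
```

When cluster size is unrelated to the contrasts, the two averages coincide. When it is informative, as in this toy example, they differ, which is why the two families of estimands are treated separately.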

Paper Structure (13 sections, 4 theorems, 12 equations, 1 figure, 2 tables)

Key Result

Theorem 1

Under Assumptions assum:consistency--assum:observed_data, we can identify $\theta_C(a,a^*)$ and $\theta_I(a,a^*)$ for any $a,a^*\in\{0,1\}$. Additionally, if Assumption assum:no_icc holds, $\tau_C$ and $\tau_I$ can be identified by

Figures (1)

  • Figure 1: Causal graphs of the causal relationships among variables in a cluster with $N_i=3$ individuals. A dashed edge indicates generic association with unknown causal structure. We omit all pre-treatment variables, $\{N_i,{\bm X}_i,{\bm V}_i\}$, and their associated causal pathways, but acknowledge that all pre-treatment variables should have direct pathways towards all mediators and outcomes ($M_{ij}$ and $Y_{ij}$ for all $j=1,2,3$). Panel (a) includes all pathways from treatment to the outcome. Panels (b)--(e) collect pathways associated with each mediation estimand.

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 2
  • Proposition 1
  • Theorem 3