Disparate Effect Of Missing Mediators On Transportability of Causal Effects

Vishwali Mhasawade, Rumi Chunara

TL;DR

This paper tackles the transportability of causal effects when mediators are incompletely observed in the target population, a setting that can bias transported indirect effects. It introduces a TMLE-based transport framework augmented with an MNAR mediator sensitivity analysis, deriving bounds on the transported stochastic indirect effect ($\text{SIE}$) as a function of residual weight variance $R^2$ and a variance-based model parameter. Through simulations and an application to the Moving to Opportunity (MTO) study, the authors show that missing mediator data can distort effects differently across disadvantaged and advantaged groups, with a practical threshold (around $R^2 = 0.29$) beyond which the disadvantaged group's transported indirect effect can become non-significant while the advantaged group's remains detectable. The work provides a principled way to assess the robustness of transported mediation estimates under MNAR missingness and highlights the importance of sensitivity analyses in policy-relevant transportability settings. Overall, the framework helps quantify how much missing mediator data can be tolerated before inference about transported mediation effects becomes unreliable in public-health contexts.

Abstract

Transported mediation effects provide an avenue to understand how upstream interventions (such as improved neighborhood conditions like green spaces) would work differently when applied to different populations as a result of factors that mediate the effects. However, when mediators are missing in the population to which the effect is to be transported, these estimates could be biased. We study this issue of missing mediators, motivated by challenges in public health, wherein mediators can be missing not at random (MNAR). We propose a sensitivity analysis framework that quantifies the impact of missing mediator data on transported mediation effects. This framework enables us to identify the settings under which the conditional transported mediation effect is rendered insignificant for the subgroup with missing mediator data. Specifically, we provide bounds on the transported mediation effect as a function of missingness. We then apply the framework to longitudinal data from the Moving to Opportunity Study, a large-scale housing voucher experiment, to quantify the effect of missing mediators on transported estimates of the effect of voucher receipt in childhood (an upstream intervention on living location) on subsequent risk of mental health or substance use disorder, mediated through parental health, across sites. Our findings provide a tangible understanding of how much missing data can be withstood for unbiased effect estimates.
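
The idea of asking how much missingness an estimate can withstand can be sketched as a tipping-point scan. The code below is only an illustration, not the paper's estimator: the function name `tipping_point`, the symmetric interval form (estimate plus or minus bias plus or minus the confidence half-width), and the linear `bias_fn` are all assumptions made here, and the numbers are made up. The paper's actual bounds would take the place of `bias_fn`.

```python
def tipping_point(estimate, ci_half_width, bias_fn, grid):
    """Return the smallest sensitivity value in `grid` at which the
    bias-adjusted interval contains the null (0), or None if it never does.

    Assumption: the adjusted interval is
        [estimate - bias - ci_half_width, estimate + bias + ci_half_width],
    where bias = bias_fn(value). This form is hypothetical.
    """
    for value in grid:
        bias = bias_fn(value)
        lower = estimate - bias - ci_half_width
        upper = estimate + bias + ci_half_width
        if lower <= 0.0 <= upper:
            return value
    return None


# Made-up inputs for illustration only.
grid = [i / 100 for i in range(51)]      # sensitivity values 0.00 .. 0.50


def hypothetical_bias(r2):
    return 0.3 * r2                      # placeholder, NOT the paper's bound


if __name__ == "__main__":
    print(tipping_point(0.12, 0.05, hypothetical_bias, grid))
```

With these made-up inputs the scan stops at the first grid value whose adjusted interval reaches zero; a bias function that never grows large enough would return `None`, meaning the estimate stays distinguishable from the null over the whole grid.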

Paper Structure

This paper contains 15 sections, 18 equations, and 4 figures.

Figures (4)

  • Figure 1: Different missingness patterns for the mediator $C$, illustrated using causal graphs: missing completely at random (MCAR) in (a), where $C \perp\!\!\!\perp M$; missing at random (MAR) in (b), where $C \perp\!\!\!\perp M \mid A, R$; and missing not at random (MNAR) in (c), where $C \not\!\perp\!\!\!\perp M \mid A, R$.
  • Figure 2: Example figure representing the interaction between neighborhood factors and cardiovascular disease (CVD) risk. Neighborhood SES ($A$) affects individual behaviors related to CVD, such as alcohol consumption, which are mediators ($C$), and $C$ in turn affects CVD risk. $A$ also affects the alcohol resources in the neighborhood ($R$), which in turn affect $C$. Alcohol consumption and CVD risk can be confounded by 'race' ($W$), whose value also defines the disadvantaged and advantaged groups. Behavioral data on alcohol consumption (the mediators) can be missing for specific subgroups [sania2021k], which is denoted by the missingness indicator ($M$). We also observe a distribution shift in individual behavior related to alcohol consumption, represented by the selection node ($S$).
  • Figure 3: Results from the sensitivity analysis under the sensitivity framework. We vary the bias in the $R^2$ measure due to missing mediator data along the x-axis for the disadvantaged and advantaged groups independently and plot the range of estimated stochastic indirect effect (SIE) values on the y-axis. Each solid bar denotes the bounds for a specified bias in $R^2$, computed as the point estimate plus its 95% confidence interval. We estimate a threshold bias of $R^2 = 0.29$: if $R^2 \geq 0.29$, the intervals contain the null value of the SIE.
  • Figure 4: Results from the sensitivity analysis under the sensitivity framework for MTO data, with the target environment being Los Angeles in (a), Boston in (b), New York City in (c), and Chicago in (d). We vary the bias in $R^2$ due to the missing mediator (parental health) along the x-axis for the minority and majority groups independently and plot the range of estimated stochastic indirect effect (SIE) values on the y-axis. Each solid bar denotes the bounds for a specified proportion of missing mediator data, computed as the point estimate plus its 95% confidence interval.
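
The three missingness mechanisms in Figure 1 can be illustrated with a small simulation. The sketch below is not taken from the paper: the function name `stratum_bias`, the distributions, and the missingness probabilities are all illustrative assumptions, and the resource variable $R$ is omitted for brevity so that MAR missingness depends on $A$ alone. Within each level of $A$, it compares the complete-case mean of the mediator $C$ with its full-sample mean.

```python
import random
import statistics


def stratum_bias(mechanism, n=20000, seed=0):
    """Return {a: complete-case mean of C minus full-sample mean of C}
    for each exposure level a, under one missingness mechanism.

    All distributions and probabilities are illustrative assumptions.
    """
    rng = random.Random(seed)
    full = {0: [], 1: []}
    observed = {0: [], 1: []}
    for _ in range(n):
        a = int(rng.random() < 0.5)          # binary exposure A
        c = rng.gauss(float(a), 1.0)         # mediator C depends on A
        if mechanism == "MCAR":
            p_missing = 0.3                  # M independent of everything
        elif mechanism == "MAR":
            p_missing = 0.5 if a == 1 else 0.1   # M depends only on observed A
        elif mechanism == "MNAR":
            p_missing = 0.6 if c > 0.5 else 0.1  # M depends on C itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        full[a].append(c)
        if rng.random() >= p_missing:        # C is observed (M = 0)
            observed[a].append(c)
    return {
        a: statistics.fmean(observed[a]) - statistics.fmean(full[a])
        for a in (0, 1)
    }


if __name__ == "__main__":
    for mechanism in ("MCAR", "MAR", "MNAR"):
        print(mechanism, stratum_bias(mechanism))
```

Under these assumed settings, the within-stratum complete-case means should stay close to the full-sample means for MCAR and MAR, while under MNAR they shift downward because larger values of $C$ are more likely to be missing, even after conditioning on $A$.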

Theorems & Definitions (1)

  • Definition 4.1