Double Debiased Machine Learning for Mediation Analysis with Continuous Treatments

Houssam Zenati, Judith Abécassis, Julie Josse, Bertrand Thirion

TL;DR

This paper tackles causal mediation analysis with continuous treatments by introducing a kernel-based double debiased machine learning (DML) estimator that achieves asymptotic normality under nonparametric nuisance learning. The core methodological advances are a Neyman-orthogonal kernel moment function and a Bayes-transformed cross-conditional mean that avoids mediator-density estimation, which makes inference scalable to high-dimensional mediators. The authors establish the asymptotic theory, derive a data-driven AMSE-optimal bandwidth, and provide consistent confidence intervals even when nuisance components are misspecified. Empirically, the method outperforms traditional estimators in simulations and is applied to cognitive-function data from the UK Biobank (UKBB), with improved stability at boundary regions, robust uncertainty quantification, and practical guidance on bandwidth choice and nuisance-learning strategies.

Abstract

Uncovering causal mediation effects is of significant value to practitioners seeking to isolate the direct treatment effect from the potential mediated effect. We propose a double machine learning (DML) algorithm for mediation analysis that supports continuous treatments. To estimate the target mediated response curve, our method uses a kernel-based doubly robust moment function for which we prove asymptotic Neyman orthogonality. This allows us to obtain asymptotic normality with nonparametric convergence rate while allowing for nonparametric or parametric estimation of the nuisance parameters. We then derive an optimal bandwidth strategy along with a procedure for estimating asymptotic confidence intervals. Finally, to illustrate the benefits of our method, we provide a numerical evaluation of our approach on a simulation along with an application to real-world medical data to analyze the effect of glycemic control on cognitive functions.

Paper Structure

This paper contains 66 sections, 6 theorems, 146 equations, 6 figures, 5 tables, 1 algorithm.

Key Result

Lemma 2.1

Figures (6)

  • Figure 1: Causal graph for mediation analysis.
  • Figure 2: Bias of mediated response estimation on simulations with different sample sizes. DML significantly outperforms OLS and KME and also improves upon IPW.
  • Figure 3: Bias of mediated response estimation with the DML estimator on simulations with different sample sizes and with two bandwidth selection strategies.
  • Figure 4: Empirical histogram of glycated hemoglobin levels (treatments) in the UKBB data.
  • Figure 5: Effect estimation on the UKBB dataset for the total effect (left) and the indirect effect (right).
  • ...and 1 more figures

Theorems & Definitions (17)

  • Definition 2.1: Total average treatment effect
  • Definition 2.2: Direct effect
  • Definition 2.3: Indirect effect
  • Definition 2.4: Mediated response
  • Definition 2.5: Conditional mean outcome
  • Definition 2.6: Cross conditional mean outcome
  • Lemma 2.1: Pearl's mediation formula
  • Lemma 3.1
  • Theorem 8: Asymptotic normality
  • Remark 9
  • ...and 7 more
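
For orientation, the quantities named in Definitions 2.1 to 2.4 and Lemma 2.1 are commonly written as below in the mediation literature. This is a sketch in standard notation, not a transcription of the paper: the symbols (outcome Y, treatment T, mediator M, covariates X, reference treatment level t', and the name η for the mediated response) are assumptions and may differ from the authors' own, and the identification formula holds only under the usual sequential ignorability and overlap conditions.

```latex
% Mediated response: outcome under treatment t with the mediator set
% to the value it would take under treatment t'.
\eta(t, t') = \mathbb{E}\big[\, Y(t, M(t')) \,\big]

% Effects relative to a reference level t' (assumed notation).
\text{Total effect:}    \quad \eta(t, t)  - \eta(t', t')
\text{Direct effect:}   \quad \eta(t, t') - \eta(t', t')
\text{Indirect effect:} \quad \eta(t, t)  - \eta(t, t')

% Pearl's mediation formula: identification from observed data.
\eta(t, t') = \int\!\!\int \mathbb{E}\big[\, Y \mid T = t,\, M = m,\, X = x \,\big]
              \, dP(m \mid T = t',\, X = x)\, dP(x)
```

With this convention the direct and indirect effects add up to the total effect, since the η(t, t') terms cancel.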