Bayesian Semiparametric Causal Inference: Targeted Doubly Robust Estimation of Treatment Effects

Gözde Sert, Abhishek Chakrabortty, Anirban Bhattacharya

TL;DR

The paper tackles causal inference for observational data with high-dimensional nuisance parameters by introducing DRDB, a Bayesian semiparametric approach that debiases nuisance estimation through targeted, summary-statistics-based modeling and a density-ratio retargeting step. By decoupling nuisance learning from target inference via sample splitting and cross-fitting (CF), DRDB yields marginal posteriors for the ATE and the one-arm means that satisfy Bernstein-von Mises theorems under mild conditions and exhibit Bayesian double robustness when only one nuisance model is correctly specified. The methodology relies on a hierarchical framework that incorporates debiased estimates of the nuisance bias into a conditional likelihood, giving tractable posteriors for the target parameters and using the full data efficiently through CF aggregation. Theoretical guarantees are complemented by extensive simulations and a real-data application (NHEFS), which show accurate point estimates and nominal coverage with high-dimensional nuisance models. The approach extends naturally to a broad class of causal estimands beyond the ATE.

Abstract

We propose a semiparametric Bayesian methodology for estimating the average treatment effect (ATE) within the potential outcomes framework using observational data with high-dimensional nuisance parameters. Our method introduces a Bayesian debiasing procedure that corrects for bias arising from nuisance estimation and employs a targeted modeling strategy based on summary statistics rather than the full data. These summary statistics are identified in a debiased manner, enabling the estimation of nuisance bias via weighted observables and facilitating hierarchical learning of the ATE. By combining debiasing with sample splitting, our approach separates nuisance estimation from inference on the target parameter, reducing sensitivity to nuisance model specification. We establish that, under mild conditions, the marginal posterior for the ATE satisfies a Bernstein-von Mises theorem when both nuisance models are correctly specified and remains consistent and robust when only one is correct, achieving Bayesian double robustness. This ensures asymptotic efficiency and frequentist validity. Extensive simulations confirm the theoretical results, demonstrating accurate point estimation and credible intervals with nominal coverage, even in high-dimensional settings. The proposed framework can also be extended to other causal estimands, and its key principles offer a general foundation for advancing Bayesian semiparametric inference more broadly.
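
To make the roles of debiasing and sample splitting concrete, the sketch below implements a frequentist analogue: the cross-fitted augmented inverse-probability-weighted (AIPW) estimator of the ATE. This is not the paper's DRDB procedure, which places posteriors on the nuisance functions and on the target. It only illustrates the shared idea that nuisance models are fit on one part of the data and a bias-corrected summary is evaluated on the held-out part. The function names, the simulated data, and the choice of logistic and linear nuisance models are all assumptions made for illustration.

```python
import numpy as np

def _design(X):
    # Add an intercept column to the covariate matrix.
    return np.column_stack([np.ones(len(X)), X])

def fit_logistic(X, a, n_iter=25, ridge=1e-6):
    # Newton-Raphson fit of a logistic propensity score model (illustrative choice).
    Z = _design(X)
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        grad = Z.T @ (a - p) - ridge * beta
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z + ridge * np.eye(Z.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def fit_linear(X, y):
    # Least-squares fit of a linear outcome regression (illustrative choice).
    return np.linalg.lstsq(_design(X), y, rcond=None)[0]

def cross_fit_aipw(X, a, y, n_folds=2, seed=0):
    # Fit nuisances on the training folds, evaluate the debiased score on the held-out fold.
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % n_folds
    psi = np.empty(n)
    for k in range(n_folds):
        test = folds == k
        train = ~test
        ps_beta = fit_logistic(X[train], a[train])
        m1_beta = fit_linear(X[train & (a == 1)], y[train & (a == 1)])
        m0_beta = fit_linear(X[train & (a == 0)], y[train & (a == 0)])
        Zt = _design(X[test])
        e = np.clip(1.0 / (1.0 + np.exp(-Zt @ ps_beta)), 0.01, 0.99)
        m1, m0 = Zt @ m1_beta, Zt @ m0_beta
        at, yt = a[test], y[test]
        # Outcome-model contrast plus inverse-probability-weighted residual corrections.
        psi[test] = m1 - m0 + at * (yt - m1) / e - (1 - at) * (yt - m0) / (1 - e)
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n)

# Simulated observational data with a true ATE of 2 (hypothetical design).
rng = np.random.default_rng(1)
n, p = 4000, 5
X = rng.normal(size=(n, p))
propensity = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
a = rng.binomial(1, propensity)
y = 2.0 * a + X[:, 0] + X[:, 1] + rng.normal(size=n)

est, se = cross_fit_aipw(X, a, y)
print(f"ATE estimate: {est:.3f} (standard error {se:.3f})")
```

The score averaged here stays consistent when either the propensity model or the outcome model is correct, which is the frequentist counterpart of the double robustness property the abstract establishes for the posterior.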

Paper Structure

This paper contains 35 sections, 11 theorems, 107 equations, 2 figures, 3 tables, 1 algorithm.

Key Result

Proposition 1

Under the model construction and the prior given in the model-construction equation for $b_1$ (label `eqn_model_constr_for_b1`), the marginal posterior $\Pi_{b_t} \equiv \Pi_{b_t}(\cdot; S_t)$ for $b_t$ follows a $t$-distribution for $t = 0, 1$. Specifically, for $n_t = |S_t|$, $\nu_t = n_t - 1$,

Figures (2)

  • Figure 1: Box plots of posterior means (based on 500 replications) and overlaid density curves (based on 20 iterations) for the posteriors $\Pi_{\texttt{Oracle}}$ (pink) and $\Pi_\Delta$ (blue) of $\Delta$. The plots show results from using three methods (BART, BR, BS) to obtain the nuisance posteriors. The subfigures correspond to different values of $p$ and $s$. Each density curve is generated using 1000 posterior samples of $\Delta$. The red dashed vertical line indicates the true $\Delta^\dagger$ ($=2$ for all settings).
  • Figure 2: Box plots of posterior means and overlaid density curves for the posteriors $\Pi_{\texttt{Oracle}}$ and $\Pi_\Delta$ of $\Delta$ for $p = 200$ and $s = 14$ or $50$. The rest of the caption details are the same as in Figure 1.

Theorems & Definitions (31)

  • Remark 1: Key methodological role of the data splitting
  • Remark 2: Hierarchical novelties
  • Remark 3
  • Remark 4: Role of PS
  • Remark 5
  • Remark 6: Scalability aspects
  • Proposition 1
  • Proposition 2
  • Remark 7: Some implementation details
  • Remark 8
  • ...and 21 more