Density Ratio-Free Doubly Robust Proxy Causal Learning

Bariscan Bozkurt, Houssam Zenati, Dimitri Meunier, Liyuan Xu, Arthur Gretton

TL;DR

The paper addresses dose-response estimation under unobserved confounding within the Proxy Causal Learning (PCL) framework. It develops two kernel-based doubly robust estimators, DRKPV and DRPMMR, that combine outcome-bridge and treatment-bridge identification without explicit density ratio estimation. Both use conditional mean embeddings in a reproducing kernel Hilbert space (RKHS) to obtain closed-form solutions that handle continuous or high-dimensional treatments. The authors prove uniform consistency and report better performance than baselines on synthetic and real datasets, with robustness to bridge-function misspecification.

Abstract

We study the problem of causal function estimation in the Proxy Causal Learning (PCL) framework, where confounders are not observed but proxies for the confounders are available. Two main approaches have been proposed: outcome bridge-based and treatment bridge-based methods. In this work, we propose two kernel-based doubly robust estimators that combine the strengths of both approaches, and naturally handle continuous and high-dimensional variables. Our identification strategy builds on a recent density ratio-free method for treatment bridge-based PCL; furthermore, in contrast to previous approaches, it does not require indicator functions or kernel smoothing over the treatment variable. These properties make it especially well-suited for continuous or high-dimensional treatments. By using kernel mean embeddings, we have closed-form solutions and strong consistency guarantees. Our estimators outperform existing methods on PCL benchmarks, including a prior doubly robust method that requires both kernel smoothing and density ratio estimation.
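To illustrate what a closed-form solution built on kernel mean embeddings can look like, here is a minimal sketch of a generic conditional mean embedding estimator fitted by kernel ridge regression. It is not the paper's DRKPV or DRPMMR algorithm. The Gaussian kernel, the scalar treatment, the regularization scaling `n * reg`, and all function names are assumptions made for illustration.

```python
import math

def gaussian_kernel(x, y, lengthscale=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))

def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x

def cme_weights(a_train, a_query, reg=1e-3, lengthscale=1.0):
    """Closed-form weights beta(a) = (K_AA + n * reg * I)^{-1} k_A(a)."""
    n = len(a_train)
    gram = [
        [gaussian_kernel(ai, aj, lengthscale) + (n * reg if i == j else 0.0)
         for j, aj in enumerate(a_train)]
        for i, ai in enumerate(a_train)
    ]
    k_query = [gaussian_kernel(ai, a_query, lengthscale) for ai in a_train]
    return solve_linear_system(gram, k_query)

def conditional_mean(f_values, a_train, a_query, reg=1e-3, lengthscale=1.0):
    """Estimate E[f(W) | A = a_query] as sum_i beta_i(a_query) * f(w_i)."""
    beta = cme_weights(a_train, a_query, reg, lengthscale)
    return sum(b * f for b, f in zip(beta, f_values))
```

The whole estimate reduces to one linear solve against the regularized Gram matrix, which is the sense in which such estimators are "closed-form". In practice the regularization strength and kernel lengthscale would need to be tuned.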

Paper Structure

This paper contains 35 sections, 22 theorems, 133 equations, 5 figures, 1 table, 5 algorithms.

Key Result

Theorem 2.4

Let the proxy conditional independence assumptions and the outcome bridge completeness assumption hold. Furthermore, suppose that there exists an outcome bridge function $h_0(w, a)$ satisfying $\mathbb{E}[Y \mid A, Z] = \mathbb{E}[h_0(W, A) \mid A, Z]$. Then, the dose-response can be identified by $\theta_{ATE}(a) = \mathbb{E}[h_0(W, a)]$.
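Once an outcome bridge function is available, the identification formula $\theta_{ATE}(a) = \mathbb{E}[h_0(W, a)]$ suggests a plug-in estimate: average the bridge over the observed outcome proxies at a fixed treatment level. The sketch below shows only this final averaging step. The names `dose_response_plugin` and `toy_bridge` are hypothetical, and the bridge is an arbitrary toy function rather than one estimated from data.

```python
def dose_response_plugin(bridge, w_samples, a):
    """Approximate E[h(W, a)] by averaging h(w_i, a) over observed outcome proxies."""
    if not w_samples:
        raise ValueError("w_samples must be non-empty")
    return sum(bridge(w, a) for w in w_samples) / len(w_samples)

# Hypothetical bridge function, chosen only to make the arithmetic easy to follow.
def toy_bridge(w, a):
    return a * w + a ** 2

w_samples = [1.0, 2.0, 3.0]
estimate = dose_response_plugin(toy_bridge, w_samples, a=2.0)
print(estimate)  # 8.0
```

The average runs over the marginal sample of $W$, not over the samples with $A = a$, which is why the treatment value is passed as a fixed argument.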

Figures (5)

  • Figure 1: A directed acyclic graph (DAG) characterizing the structure assumed in the proximal causal learning (PCL) framework, satisfying the proxy conditional independence assumptions (Miao et al., 2018). Observed variables are shown as yellow nodes: $A$ denotes the treatment, $Y$ denotes the outcome, $Z$ denotes the treatment proxy, and $W$ denotes the outcome proxy. The unobserved confounder $U$ is depicted as a white node. Dotted bi-directional arrows indicate potential bidirectional causality or the existence of a shared latent ancestor between variables.
  • Figure 2: Dose-response curve estimation across various datasets and algorithms: DRKPV and DRPMMR (ours), PKDR (Wu et al., 2024), KAP (Bozkurt et al., 2025), KNC (Singh, 2023), and KPV and PMMR (Mastouri et al., 2021). (a) Synthetic low-dimensional setting, (b) dSprite dataset, (c) legalized abortion and crime dataset, and (d) grade retention and cognitive outcome datasets.
  • Figure 3: Experimental results under bridge function misspecification on the synthetic low-dimensional data: (a, b) DRKPV estimates under outcome and treatment bridge misspecification, respectively; (c, d) DRPMMR estimates under outcome and treatment bridge misspecification, respectively.
  • Figure 4: Experimental results under bridge function misspecification with the synthetic low-dimensional data generation process: (a) DRKPV estimation when the outcome bridge function is misspecified, (b) DRKPV estimation when the treatment bridge function is misspecified, (c) DRPMMR estimation when the outcome bridge function is misspecified, (d) DRPMMR estimation when the treatment bridge function is misspecified.
  • Figure 5: Dose-response estimation curves for the Job Corps experimental settings described in the supplementary material. The panels for Settings 1 to 6 display the dose-response estimates from our proposed methods, DRKPV and DRPMMR, compared against KPV, PMMR, KNC, KAP, and the oracle method, Kernel-ATE.

Theorems & Definitions (44)

  • Definition 2.1
  • Theorem 2.4: Causal identification with outcome bridge function (Miao et al., 2018)
  • Theorem 2.6: Causal identification with treatment bridge function (Bozkurt et al., 2025)
  • Theorem 2.7: Doubly robust causal identification
  • Remark 2.8
  • Remark 2.9
  • Theorem 4.1
  • Theorem 4.2
  • Theorem A.1: Doubly robust causal identification; replica of Theorem 2.7
  • proof
  • ...and 34 more