Density Ratio-based Proxy Causal Learning Without Density Ratios

Bariscan Bozkurt, Ben Deaner, Dimitri Meunier, Liyuan Xu, Arthur Gretton

TL;DR

This work introduces a density-ratio-free method for Proxy Causal Learning (PCL). It uses a treatment bridge function in a reproducing kernel Hilbert space (RKHS) to identify and estimate dose-response and conditional dose-response curves under hidden confounding. By formulating the problem with kernel mean embeddings and a three-stage regression (the third stage is used for ATT), the authors derive closed-form estimators for $f_{ATE}$ and $f_{ATT}$ and prove non-asymptotic consistency under standard RKHS and completeness assumptions. Because the approach avoids explicit density ratio estimation, it can handle continuous and high-dimensional treatments, and it is validated empirically on synthetic and real data.

Abstract

We address the setting of Proxy Causal Learning (PCL), which has the goal of estimating causal effects from observed data in the presence of hidden confounding. Proxy methods accomplish this task using two proxy variables related to the latent confounder: a treatment proxy (related to the treatment) and an outcome proxy (related to the outcome). Two approaches have been proposed to perform causal effect estimation given proxy variables; however, only one of these has found mainstream acceptance, since the other was understood to require density ratio estimation, a challenging task in high dimensions. In the present work, we propose a practical and effective implementation of the second approach, which bypasses explicit density ratio estimation and is suitable for continuous and high-dimensional treatments. We employ kernel ridge regression to derive estimators, resulting in simple closed-form solutions for dose-response and conditional dose-response curves, along with consistency guarantees. Our methods empirically demonstrate superior or comparable performance to existing frameworks on synthetic and real-world datasets.
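
The abstract relies on kernel ridge regression having a closed-form solution. As a rough illustration of that building block only (not the paper's ATE or ATT estimators), the sketch below fits a generic kernel ridge regressor by solving $(K + n\lambda I)\alpha = y$. The Gaussian kernel, the function names, the toy sine data, and the values of `lam` and `lengthscale` are all assumptions made for this example.

```python
import numpy as np

def gaussian_kernel(X, Y, lengthscale=1.0):
    """Gaussian (RBF) Gram matrix between the rows of X and the rows of Y."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * lengthscale**2))

def krr_fit(X, y, lam=1e-3, lengthscale=1.0):
    """Closed-form kernel ridge regression: solve (K + n*lam*I) alpha = y."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, lengthscale)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def krr_predict(X_train, alpha, X_new, lengthscale=1.0):
    """Prediction f(x) = sum_i alpha_i k(x, x_i)."""
    return gaussian_kernel(X_new, X_train, lengthscale) @ alpha

# Toy data (hypothetical): noisy observations of sin(x).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

alpha = krr_fit(X, y, lam=1e-3, lengthscale=0.7)
X_test = np.linspace(-2.5, 2.5, 50)[:, None]
pred = krr_predict(X, alpha, X_test, lengthscale=0.7)
max_err = float(np.max(np.abs(pred - np.sin(X_test[:, 0]))))
```

Each regression stage in a multi-stage kernel method is a linear solve of this shape, which is why such estimators can be written in closed form. In practice the regularization and kernel lengthscale would be tuned rather than fixed by hand as they are here.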

Paper Structure

This paper contains 62 sections, 49 theorems, 377 equations, 7 figures, 1 table.

Key Result

Theorem 3.4

Let Assumptions (assum:ProxyCausalAssumptions1) and (assum:AlternativeProxyAssumptionCompleteness1) hold. Furthermore, suppose that there exist square integrable functions $\varphi_0^{\text{ATE}}$ and $\varphi_0^{\text{ATT}}$ that satisfy Equations (eq:AlternativeProxyATEBridgeFunction) and (eq:Alte

Figures (7)

  • Figure 1: An instance of a Directed Acyclic Graph (DAG) for the PCL setting, which satisfies the required Assumption (\ref{assumption:proxy}). In this graph, the gray circles denote the observed variables: $A$ denotes the treatment, $Y$ denotes the outcome, $Z$ denotes the treatment proxy, and $W$ denotes the outcome proxy. The white circle denotes the unobserved confounding variable $U$. Bi-directional dotted arrows indicate that either direction in the DAG is possible, or that both variables may share a common ancestor.
  • Figure 2: Dose-response curve estimation across various datasets and algorithms: Kernel Alternative Proxy (Ours), PKIPW (wu2024doubly), Kernel Negative Control (singh2023kernelmethodsunobservedconfounding), KPV (Mastouri2021ProximalCL), and PMMR (Mastouri2021ProximalCL). (a) Synthetic low-dimensional setting, (b) dSprite dataset, (c) legalized abortion and crime dataset, and (d) grade retention and cognitive outcome datasets.
  • Figure 3: Conditional dose-response curve estimation for synthetic low-dimensional data across $a'$ values and algorithms (averaged over 30 different runs) - mean solid line and standard deviation envelopes.
  • Figure 4: Dose-response estimation curves for the Job Corps experimental settings that are introduced in S.M. (Sec. \ref{sec:Appendix_JobCorpsExperiments}). Panels (a)-(l) illustrate the estimation curves for our approach, KPV, KNC, and the oracle method Kernel-ATE across Settings 1-12, respectively.
  • Figure 5: Conditional dose-response estimation curves for Job Corps experimental settings $1$, $2$, $5$, and $6$ that are introduced in S.M. (Sec. \ref{sec:Appendix_JobCorpsExperiments}). Panels (a) and (b) show estimation curves for our approach, KNC, and the oracle method Kernel-ATT in Setting $1$ for $a' = 500$ and $a' = 1000$, respectively. Panels (c) and (d) display the corresponding curves for Setting $2$. Similarly, panels (e) and (f) illustrate the results for Setting $5$, while panels (g) and (h) present those for Setting $6$.
  • ...and 2 more figures

Theorems & Definitions (100)

  • Definition 3.1
  • Theorem 3.4
  • Remark 3.5
  • Remark 3.6
  • Theorem 5.4
  • Theorem 9.1
  • proof
  • Theorem 9.2
  • proof
  • Theorem 10.1: Picard's Theorem; Theorem 15.8 in linearIntegralEquationskress2013
  • ...and 90 more