A Density Ratio Super Learner

Wencheng Wu; David Benkeser

TL;DR

The paper addresses the estimation of density ratios, a quantity central to covariate shift and to certain causal-inference estimands. It introduces a density ratio super learner that combines kernel-based and classification-based learners within a cross-validated risk framework, guided by a novel qualified loss $L(O,\psi)=-\mathbb{I}(\lambda=1)\log\psi(x_1,x_2)+\mathbb{I}(\lambda=0)\log\psi(x_1,x_2)$, which ensures that $E_0L(O,\psi)$ is minimized at the true ratio $\psi_0$. The method is evaluated in two Monte Carlo simulations, one for mediation analysis and one for a longitudinal modified treatment policy (LMTP), which indicate that the density ratio super learner can asymptotically approach oracle performance and behaves robustly in finite samples, particularly when sample sizes are small. Beyond causal inference, the approach offers a practical tool for covariate shift and other density-ratio estimation problems by combining ensemble learning with a principled loss framework.
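To make the idea of a classification-based density ratio learner concrete, the following is a minimal sketch and not the authors' implementation. It relies on the standard identity that the ratio of two densities equals the classifier odds $P(\lambda=1\mid x)/P(\lambda=0\mid x)$ multiplied by the inverse class odds $n_0/n_1$. The toy Gaussian data, the one-dimensional logistic model, and the names `fit_logistic_1d` and `est_ratio` are all assumptions made for illustration; the sketch does not use the paper's loss function or its ensemble step.

```python
import math
import random

def fit_logistic_1d(xs, labels, iters=25):
    """Fit P(label=1 | x) = sigmoid(b0 + b1 * x) by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, labels):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood
            g0 += y - p
            g1 += (y - p) * x
            # Hessian of the negative log-likelihood
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g (2x2 inverse written out)
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical toy data: the numerator sample (lambda = 1) is N(1, 1) and
# the denominator sample (lambda = 0) is N(0, 1), so the true density
# ratio is exp(x - 0.5).
random.seed(0)
n1, n0 = 1500, 3000
x_num = [random.gauss(1.0, 1.0) for _ in range(n1)]
x_den = [random.gauss(0.0, 1.0) for _ in range(n0)]

xs = x_num + x_den
labels = [1] * n1 + [0] * n0
b0, b1 = fit_logistic_1d(xs, labels)

def est_ratio(x):
    """Classifier odds rescaled by n0 / n1 to undo the class imbalance."""
    return math.exp(b0 + b1 * x) * (n0 / n1)

def true_ratio(x):
    return math.exp(x - 0.5)

for x in (-1.0, 0.5, 1.5):
    print(f"x={x:+.1f}  estimated={est_ratio(x):.3f}  true={true_ratio(x):.3f}")
```

In this toy setting the logistic model is correctly specified, so the estimated ratio should track the true one closely. With a misspecified classifier the estimate can be poor, which is one motivation for combining several candidate learners instead of relying on a single one.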

Abstract

The estimation of the ratio of two probability density functions is of great interest in many fields of statistics, including causal inference. In this study, we develop an ensemble estimator of density ratios, based on super learning, with a novel loss function. We show that this loss function is qualified for building super learners. Two simulations, corresponding to mediation analysis and longitudinal modified treatment policies in causal inference, where density ratios are nuisance parameters, are conducted to show our density ratio super learner's empirical performance.

Paper Structure (13 sections, 1 theorem, 21 equations, 3 figures, 2 tables)

Introduction
Methods
Density Ratio Parameter
Density Ratios in Causal Inference
Mediation Analysis
Longitudinal Modified Treatment Policy
Super Learning
A Qualified Loss Function for Density Ratios
Simulation Study
Results
Mediation Analysis Setting
LMTP Setting
Discussion and Conclusion

Key Result

Theorem 2.1

Suppose the marginal distributions of $X_1$ given $X_2$ and $\lambda$ have the same support for different values of $\lambda$, $p_0(\lambda=1)>0$, and $p_0(\lambda=0)>0$. Let $L(O,\psi)=-\mathbb{I}(\lambda=1)\log\psi(x_1,x_2)+\mathbb{I}(\lambda=0)\log\psi(x_1,x_2)$. Then $E_0L(O,\psi)$ will only be minimized whe

Figures (3)

Figure 1: Average Hold-Out Risks For Individual Learners and the Super Learner
Figure 2: True Ratio vs Super Learner Estimated Ratio at Different Values of $W\;(n=700)$
Figure 3: True Ratio vs Super Learner Estimated Ratio at Different Values of $W\;(n=2000)$

Theorems & Definitions (2)

Theorem 2.1
proof
