Direct Debiased Machine Learning via Bregman Divergence Minimization

Masahiro Kato

TL;DR

This work introduces Direct Debiased Machine Learning (DDML), a unified framework for reducing first-stage bias when machine-learned regression functions are plugged into estimators of causal and policy parameters. It pairs Neyman targeted estimation (which minimizes the discrepancy between the Neyman orthogonal score computed with the true nuisance parameters and its plug-in version) with generalized Riesz regression (which learns the Riesz representer by Bregman-divergence minimization). The framework covers ATE, ATT, AME, and covariate-shift problems, and recovers squared-loss Riesz regression, KL-type entropy balancing, and direct density-ratio estimation as special cases. DDML yields minimax-optimal nuisance estimation rates and, under mild conditions, asymptotically normal and efficient estimators of the target parameters. For specific pairs of models and Riesz representer estimation methods, it also delivers covariate balancing automatically, without solving a separate balancing objective. Overall, DDML unifies several strands of semiparametric efficiency and causal inference into a single methodology.

Abstract

We develop a direct debiased machine learning framework comprising Neyman targeted estimation and generalized Riesz regression. Our framework unifies Riesz regression for automatic debiased machine learning, covariate balancing, targeted maximum likelihood estimation (TMLE), and density-ratio estimation. In many problems involving causal effects or structural models, the parameters of interest depend on regression functions. Plugging regression functions estimated by machine learning methods into the identifying equations can yield poor performance because of first-stage bias. To reduce such bias, debiased machine learning employs Neyman orthogonal estimating equations. Debiased machine learning typically requires estimation of the Riesz representer and the regression function. For this problem, we develop a direct debiased machine learning framework with an end-to-end algorithm. We formulate estimation of the nuisance parameters, the regression function and the Riesz representer, as minimizing the discrepancy between Neyman orthogonal scores computed with known and unknown nuisance parameters, which we refer to as Neyman targeted estimation. Neyman targeted estimation includes Riesz representer estimation, and we measure discrepancies using the Bregman divergence. The Bregman divergence encompasses various loss functions as special cases, where the squared loss yields Riesz regression and the Kullback-Leibler divergence yields entropy balancing. We refer to this Riesz representer estimation as generalized Riesz regression. Neyman targeted estimation also yields TMLE as a special case for regression function estimation. Furthermore, for specific pairs of models and Riesz representer estimation methods, we can automatically obtain the covariate balancing property without explicitly solving the covariate balancing objective.
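The squared-loss special case and the automatic balancing property mentioned in the abstract can be illustrated with a minimal sketch. It assumes the ATE setting, where the Riesz representer is $\alpha(D, Z) = D/e(Z) - (1 - D)/(1 - e(Z))$ with propensity score $e(Z)$, and a model for $\alpha$ that is linear in the features $\phi(D, Z) = (D\,b(Z), (1 - D)\,b(Z))$. The function name `riesz_regression_ate`, the basis $b(Z) = (1, Z)$, and the simulated data are illustrative assumptions, not the paper's implementation. Under this model, minimizing the empirical squared-loss objective $\frac{1}{n}\sum_i \{\alpha(X_i)^2 - 2[\alpha(1, Z_i) - \alpha(0, Z_i)]\}$ has a closed form, and its first-order condition is exactly a covariate balancing condition.

```python
import numpy as np


def riesz_regression_ate(D, Z):
    """Squared-loss Riesz regression for the ATE with a linear-in-features model.

    Hypothetical helper for illustration only.
    Model: alpha(D, Z) = phi(D, Z) @ beta, phi(D, Z) = [D * b(Z), (1 - D) * b(Z)].
    """
    n = D.shape[0]
    b = np.column_stack([np.ones(n), Z])                  # basis b(Z) with intercept
    phi = np.hstack([D[:, None] * b, (1 - D)[:, None] * b])
    # Sample mean of phi(1, Z) - phi(0, Z) = [b(Z), -b(Z)].
    m = np.concatenate([b.mean(axis=0), -b.mean(axis=0)])
    gram = phi.T @ phi / n                                # sample mean of phi phi^T
    beta = np.linalg.solve(gram, m)                       # first-order condition
    return phi @ beta                                     # fitted alpha(D_i, Z_i)


# Simulated data (assumed design, for illustration only).
rng = np.random.default_rng(0)
n = 2000
Z = rng.normal(size=(n, 2))
propensity = 1.0 / (1.0 + np.exp(-(0.5 * Z[:, 0] - 0.25 * Z[:, 1])))
D = rng.binomial(1, propensity).astype(float)

alpha = riesz_regression_ate(D, Z)

# Balancing: reweighted treated and control covariate means match the full sample.
treated_weighted_mean = (alpha * D) @ Z / n
control_weighted_mean = (-alpha * (1 - D)) @ Z / n
overall_mean = Z.mean(axis=0)
```

In this sketch the balancing holds exactly in-sample because the features entering the model are the same functions being balanced; no separate balancing objective is solved. Other Bregman divergences, such as the KL-type choice, would generally require an iterative solver rather than a single linear solve.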

Paper Structure

This paper contains 56 sections, 1 theorem, 110 equations, 1 figure.

Key Result

Theorem 2.1

For (eq:basic_original_target) and (eq:basic_direct_target), $\pi^*(\gamma) = \pi^\dagger(\gamma)$ holds. We also have

Figures (1)

  • Figure 1: Concept of direct debiased machine learning.

Theorems & Definitions (1)

  • Theorem 2.1