Automatic Debiased Machine Learning via Riesz Regression

Victor Chernozhukov, Whitney K. Newey, Victor Quintas-Martinez, Vasilis Syrgkanis

TL;DR

In Monte Carlo examples, automatic debiasing sometimes performs better than debiasing via inverse propensity scores and never worse. Finite-sample mean square error bounds for Riesz regression estimators and asymptotic theory are also given.

Abstract

A variety of interesting parameters may depend on high-dimensional regressions. Machine learning can be used to estimate such parameters. However, estimators based on machine learners can be severely biased by regularization and/or model selection. Debiased machine learning uses Neyman orthogonal estimating equations to reduce such biases. Debiased machine learning generally requires estimation of unknown Riesz representers. A primary innovation of this paper is to provide Riesz regression estimators of Riesz representers that depend on the parameter of interest, rather than explicit formulae, and that can employ any machine learner, including neural nets and random forests. End-to-end algorithms emerge where the researcher chooses the parameter of interest and the machine learner, and the debiasing follows automatically. Another innovation here is debiased machine learners of parameters depending on generalized regressions, including high-dimensional generalized linear models. An empirical example of automatic debiased machine learning using neural nets is given. We find in Monte Carlo examples that automatic debiasing sometimes performs better than debiasing via inverse propensity scores and never worse. Finite-sample mean square error bounds for Riesz regression estimators and asymptotic theory are also given.
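To make the idea of Riesz regression concrete, the following is a minimal sketch for the average treatment effect, not the authors' implementation. It assumes a linear dictionary `basis(d, x)` (a hypothetical choice; the abstract allows any machine learner, including neural nets and random forests), so that minimizing the empirical Riesz loss, the sample mean of $\alpha(W)^2 - 2\,m(W;\alpha)$ with $m(W;\alpha)=\alpha(1,X)-\alpha(0,X)$, has a closed form. The simulated data, the ridge penalties, and all function names are illustrative assumptions, and sample splitting is omitted for brevity.

```python
import numpy as np

def basis(d, x):
    """Hypothetical dictionary b(d, x): [1, x] interacted with d and 1 - d."""
    d = np.asarray(d, dtype=float).reshape(-1, 1)
    feats = np.hstack([np.ones((x.shape[0], 1)), x])
    return np.hstack([d * feats, (1.0 - d) * feats])

def ridge_solve(G, v, ridge):
    """Solve (G + ridge * I) c = v."""
    return np.linalg.solve(G + ridge * np.eye(G.shape[0]), v)

def debiased_ate(d, x, y, ridge_g=0.5, ridge_a=1e-6):
    """Return (debiased estimate, standard error, plug-in estimate)."""
    n = len(y)
    B = basis(d, x)              # b(D, X)
    B1 = basis(np.ones(n), x)    # b(1, X)
    B0 = basis(np.zeros(n), x)   # b(0, X)
    G = B.T @ B / n

    # Outcome regression g(d, x) = b(d, x)' beta, deliberately
    # over-regularized here to mimic regularization bias (assumption).
    beta = ridge_solve(G, B.T @ y / n, ridge_g)

    # Riesz regression with alpha = b' rho: minimizing
    # mean(alpha^2 - 2 * (alpha(1, X) - alpha(0, X))) gives the
    # first-order condition G rho = mean(b(1, X) - b(0, X)).
    rho = ridge_solve(G, (B1 - B0).mean(axis=0), ridge_a)

    plug_in = (B1 - B0) @ beta          # g(1, X) - g(0, X)
    alpha = B @ rho                     # estimated Riesz representer
    psi = plug_in + alpha * (y - B @ beta)  # orthogonal score
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n), plug_in.mean()

# Simulated data (assumption): the true average treatment effect is 2.
rng = np.random.default_rng(0)
n = 4000
x = rng.normal(size=(n, 2))
propensity = 1.0 / (1.0 + np.exp(-x[:, 0]))
d = (rng.uniform(size=n) < propensity).astype(float)
y = 2.0 * d + x[:, 0] + 0.5 * x[:, 1] + rng.normal(size=n)

theta, se, plug = debiased_ate(d, x, y)
print(f"plug-in: {plug:.3f}  debiased: {theta:.3f}  (se {se:.3f})")
```

Note that the representer is learned directly from the functional $m$, without estimating a propensity score and inverting it. In this sketch the heavily regularized outcome regression biases the plug-in estimate, and the correction term built from the estimated representer offsets that bias.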

Paper Structure

This paper contains 16 sections, 8 theorems, 99 equations, 1 figure, and 4 tables.

Key Result

Theorem 2.1

Let $\delta_{n}$ be an upper bound on the critical radius of $\mathrm{star}(\mathcal{A}_n-\alpha_{0})$ and $\mathrm{star}(m\circ \mathcal{A}_n-m\circ\alpha_{0})$. If Assumptions 1 and 2 are satisfied then it follows that with probability $1-\zeta$, for some universal constant $C$,

Figures (1)

  • Figure 1: Heterogeneous Effects

Theorems & Definitions (23)

  • Example 1: Average Treatment Effect
  • Example 2: Average Marginal Effect
  • Example 3: Average Policy Effect
  • Theorem 2.1
  • Corollary 2.2
  • Theorem 2.3
  • Corollary 2.4
  • Remark 3.1
  • Example 4: Inverse Propensity Score Weighting
  • Theorem 3.2
  • ...and 13 more