Debiased Nonparametric Regression for Statistical Inference and Distributionally Robustness

Masahiro Kato

TL;DR

This work addresses the lack of statistical-inference guarantees for modern nonparametric regression by introducing a model-free debiasing approach that yields pointwise and uniform risk convergence and asymptotic normality under mild Hölder smoothness conditions. The method proceeds in three stages: (i) estimate the target function $f_0$ with a smooth estimator $\widehat{f}_n$; (ii) estimate the conditional expected residual $\mathbb{E}[Y-\widehat{f}_n(X)\mid X]$ via local polynomial regression to obtain $\widehat{b}_n$; and (iii) form the debiased estimator $\widetilde{f}_n=\widehat{f}_n+\widehat{b}_n$. The debiased estimator has a closed-form solution, an explicit bias-variance decomposition, and a double-robustness property. The key theoretical results show that, when $f_0-\widehat{f}_n\in\Sigma(s,L)$, the estimator achieves the pointwise MSE rate $O(n^{-s/(2s+1)})$, the asymptotic normality $\sqrt{n h_n}\,(\widetilde{f}_n(x_0)-f_0(x_0))\to_d \mathcal{N}(0,V(x_0))$, and the uniform convergence rate $\|\widetilde{f}_n-f_0\|_\infty^2=O(((\log n)/n)^{2s/(2s+1)})$. These guarantees enable practical statistical inference for nonparametric regression, including modern ML estimators, and improve robustness under covariate shift.
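The three stages can be illustrated with a minimal, self-contained sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a one-dimensional covariate, a deliberately misspecified straight-line fit as the first-stage estimator, a local linear (degree-1) residual fit with an Epanechnikov kernel, a fixed bandwidth `h=0.15`, the same sample reused in both stages, and the hypothetical helper names `fit_line`, `local_linear`, and `debiased_estimator`.

```python
import math
import random


def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_linear(x0, xs, ys, h):
    """Local linear (degree-1 local polynomial) fit of ys on xs, evaluated at x0."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = epanechnikov(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det <= 1e-12:
        # Degenerate window: fall back to a kernel-weighted mean (or 0 if empty).
        return t0 / s0 if s0 > 0.0 else 0.0
    # Intercept of the weighted least-squares line, i.e. the fitted value at x0.
    return (s2 * t0 - s1 * t1) / det


def fit_line(xs, ys):
    """Stage (i), illustrative only: a smooth but biased first-stage estimator."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def debiased_estimator(xs, ys, f_hat, h):
    """Stages (ii) and (iii): regress residuals locally, then add the correction."""
    residuals = [y - f_hat(x) for x, y in zip(xs, ys)]

    def f_tilde(x0):
        b_hat = local_linear(x0, xs, residuals, h)  # estimate of E[Y - f_hat(X) | X = x0]
        return f_hat(x0) + b_hat

    return f_tilde


def f0(x):
    """Synthetic target function, assumed for this demo."""
    return math.sin(3.0 * x)


rng = random.Random(0)
n = 2000
xs = [rng.random() for _ in range(n)]
ys = [f0(x) + rng.gauss(0.0, 0.2) for x in xs]

f_hat = fit_line(xs, ys)
f_tilde = debiased_estimator(xs, ys, f_hat, h=0.15)

grid = [0.1 + 0.1 * k for k in range(9)]
err_first = max(abs(f_hat(x) - f0(x)) for x in grid)
err_debiased = max(abs(f_tilde(x) - f0(x)) for x in grid)
print(f"max |f_hat - f0| on grid:   {err_first:.3f}")
print(f"max |f_tilde - f0| on grid: {err_debiased:.3f}")
```

The straight-line first stage is chosen only because its error $f_0-\widehat{f}_n$ is smooth, which is the situation the smoothness condition describes; in practice the first stage could be any smooth estimator, and the bandwidth would need to be tuned rather than fixed.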

Abstract

This study proposes a debiasing method for smooth nonparametric estimators. While machine learning techniques such as random forests and neural networks have demonstrated strong predictive performance, their theoretical properties remain relatively underexplored. In particular, many modern algorithms lack guarantees of pointwise and uniform risk convergence, as well as asymptotic normality. These properties are essential for statistical inference and robust estimation and have been well-established for classical methods such as Nadaraya-Watson regression. To ensure these properties for various nonparametric regression estimators, we introduce a model-free debiasing method. By incorporating a correction term that estimates the conditional expected residual of the original estimator, or equivalently, its estimation error, into the initial nonparametric regression estimator, we obtain a debiased estimator that satisfies pointwise and uniform risk convergence, along with asymptotic normality, under mild smoothness conditions. These properties facilitate statistical inference and enhance robustness to covariate shift, making the method broadly applicable to a wide range of nonparametric regression problems.
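As a reading aid, the correction term can be written in the standard textbook form of a local polynomial estimator applied to the first-stage residuals. The notation below is an assumption for illustration and may differ from the paper's: a scalar covariate, a kernel $K$, a polynomial degree $\ell$, and the monomial vector $U$.

```latex
\begin{align*}
U(u) &= \bigl(1,\, u,\, u^2/2!,\, \dots,\, u^{\ell}/\ell!\bigr)^{\top}, \\
\widehat{\theta}_n(x) &= \operatorname*{arg\,min}_{\theta \in \mathbb{R}^{\ell+1}}
  \sum_{i=1}^{n} \Bigl( Y_i - \widehat{f}_n(X_i)
  - \theta^{\top} U\!\Bigl(\tfrac{X_i - x}{h_n}\Bigr) \Bigr)^{2}
  K\!\Bigl(\tfrac{X_i - x}{h_n}\Bigr), \\
\widehat{b}_n(x) &= U(0)^{\top}\, \widehat{\theta}_n(x), \\
\widetilde{f}_n(x) &= \widehat{f}_n(x) + \widehat{b}_n(x).
\end{align*}
```

Because the second line is a weighted least-squares problem, $\widehat{\theta}_n(x)$ has a closed-form solution whenever the weighted design matrix is invertible. Given a consistent estimate $\widehat{V}(x_0)$ of the asymptotic variance (how to obtain one is not specified here), the asymptotic normality result would suggest a Wald-type interval of the form $\widetilde{f}_n(x_0) \pm z_{1-\alpha/2}\sqrt{\widehat{V}(x_0)/(n h_n)}$.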

Paper Structure (24 sections, 8 theorems, 42 equations)

Introduction
Notation.
Content of this study
Related work
The debiased estimator
The debiased nonparametric regression with local polynomial conditional expected residual estimation
First-stage nonparametric regression
Second-stage conditional expected residual estimation
Convergence analysis
Closed-form solution
Bias and variance decomposition
Pointwise convergence of the MSE
Pointwise asymptotic normality
Uniform convergence
Double robustness
...and 9 more sections

Key Result

Theorem 4.2

Let $s, L, C, C_1, C_2, C_3 > 0$ be constants independent of $f_0$ and $n$. Let $h_n$ be the bandwidth of the local polynomial estimator. For every $x \in {\mathcal{X}}$, the following hold: If $f_0 - \widehat{f}_n$ belongs to the Hölder class $\Sigma(s, L)$ almost surely as $n\to \infty$, then for any $\varepsilon > 0$ and for all $x_0 \in {\mathcal{X}}$, there exists $n_0 > 0$ such that for all

Theorems & Definitions (13)

Definition 4.1: Hölder class
Theorem 4.2: Bias and variance decomposition
Theorem 4.3: Pointwise MSE convergence
Corollary 4.4: MSE over the distribution of $X$
Theorem 4.5: Asymptotic normality
Theorem 4.6: Uniform convergence
Theorem 4.7: Double robustness
Lemma A.1: From Proposition 1.12 in Tsybakov2008
Lemma A.2
proof
...and 3 more
