Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations

Hajo Holzmann, Alexander Meister

TL;DR

This work delivers root-$n$-consistent, smoothing-parameter-free estimators for inverse-density-weighted functionals in multivariate settings by employing $K$-th order Voronoi tessellations with polynomial least-squares fits on each cell. The approach avoids nonparametric density or regression estimation, yet achieves the parametric rate under mild Hölder smoothness of the regression function $G$; information-theoretic lower bounds show that some smoothness is required when the covariate density $f_Z$ is unknown. The framework covers estimation of the functionals $\Psi$ and $\Phi$, with applications to Berkson errors-in-variables models, random coefficient models, ATT/ATE estimation, and transfer learning under covariate shift, and is supported by simulations illustrating practical performance. The results offer bias-corrected alternatives to traditional matching estimators and illuminate the trade-offs between design density, smoothness, and achievable rates in multivariate settings.

Abstract

Expected values weighted by the inverse of a multivariate density or, equivalently, Lebesgue integrals of regression functions with multivariate regressors occur in various areas of applications, including estimating average treatment effects, nonparametric estimators in random coefficient regression models or deconvolution estimators in Berkson errors-in-variables models. The frequently used nearest-neighbor and matching estimators suffer from bias problems in multiple dimensions. By using polynomial least squares fits on each cell of the $K^{\text{th}}$-order Voronoi tessellation for sufficiently large $K$, we develop novel modifications of nearest-neighbor and matching estimators which again converge at the parametric $\sqrt{n}$-rate under mild smoothness assumptions on the unknown regression function and without any smoothness conditions on the unknown density of the covariates. We stress that in contrast to competing methods for correcting for the bias of matching estimators, our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size dependent smoothing parameters. We complement the upper bounds with appropriate lower bounds derived from information-theoretic arguments, which show that some smoothness of the regression function is indeed required to achieve the parametric rate. Simulations illustrate the practical feasibility of the proposed methods.
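
To make the equivalence in the first sentence concrete: if $Y$ has regression function $G(z) = E[Y \mid Z = z]$ and $Z$ has density $f_Z > 0$ on a set $S$, then $E[Y / f_Z(Z)] = \int_S G(z)\, dz$. The classical nearest-neighbor estimator of this integral weights each $Y_j$ by the Lebesgue volume of the Voronoi cell of $Z_j$. The sketch below illustrates only that baseline, not the authors' method: it omits the polynomial least-squares fits and the $K$-th order tessellation. It assumes $S = [0,1]^d$, approximates the cell volumes by Monte Carlo, and uses hypothetical helper names (`sample_design`, `voronoi_cell_volumes`, `nn_integral_estimate`) and a hypothetical design density and regression function chosen purely for illustration.

```python
import numpy as np


def sample_design(n, d, rng):
    """Draw n points in [0,1]^d with independent coordinates of density
    f(z) = 0.5 + z (illustrative choice, bounded away from zero)."""
    u = rng.random((n, d))
    # Inverse of the CDF F(z) = 0.5*z + 0.5*z**2 on [0, 1].
    return (-1.0 + np.sqrt(1.0 + 8.0 * u)) / 2.0


def voronoi_cell_volumes(Z, n_mc=20000, rng=None, chunk=2000):
    """Monte Carlo approximation of the volumes of the first-order Voronoi
    cells of the rows of Z, intersected with the unit cube [0,1]^d."""
    rng = np.random.default_rng(rng)
    n, d = Z.shape
    counts = np.zeros(n)
    z_sq = (Z ** 2).sum(axis=1)
    for start in range(0, n_mc, chunk):
        U = rng.random((min(chunk, n_mc - start), d))
        # Squared distance minus |u|^2, which does not change the argmin.
        dist = z_sq[None, :] - 2.0 * U @ Z.T
        nearest = dist.argmin(axis=1)
        counts += np.bincount(nearest, minlength=n)
    # The unit cube has volume 1, so hit frequencies are volume estimates.
    return counts / n_mc


def nn_integral_estimate(Z, Y, **kwargs):
    """Baseline nearest-neighbor estimate: sum_j Y_j * vol(cell_j)."""
    return float(voronoi_cell_volumes(Z, **kwargs) @ Y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Z = sample_design(500, 2, rng)
    # Illustrative regression function G(z) = z_1 + z_2, whose integral
    # over [0,1]^2 equals 1, observed with additive noise.
    Y = Z.sum(axis=1) + 0.1 * rng.standard_normal(500)
    print("nearest-neighbor estimate:", nn_integral_estimate(Z, Y, rng=1))
    print("unweighted sample mean:   ", Y.mean())
```

Because the design is not uniform, the unweighted sample mean of $Y$ targets $E[Y]$ rather than $\int_S G(z)\, dz$; the cell-volume weights play the role of $1/(n f_Z(Z_j))$ without estimating $f_Z$ explicitly.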

Paper Structure (23 sections, 12 theorems, 172 equations, 2 figures, 3 tables)

Key Result

Theorem 1

Consider estimation of the functional $\Psi$ in eq:thefunctional in dimensions $d \geq 2$ under Assumptions ass:deigndens and ass:boundeffect. Suppose that the regression function $G$ in eq:regressionfct belongs to the Hölder class $\mathcal{G}(l, \beta, C_\mathrm{H})$ in eq:hoelderclass with pa [...] where the constant $C>0$ depends only on $L, K, \rho, \rho', \rho'', \bar{\rho}, d$ and $\lambda(S^*)$.

Figures (2)

  • Figure 1: Plots of $\sqrt{n}(\hat{\Psi} - \Psi)$ in simulation scenario $f_1$ with $L=0$, $K=1$, $n=1000$ over $N=1000$ repetitions. Left: Density plot (black curve) and normal density with estimated parameters (red curve). Right: QQ-Plot against standard normal, and qqline.
  • Figure 2: Plots of $\sqrt{n}(\hat{\Psi} - \Psi)$ in simulation scenario $f_1$ with $L=1$, $K=6$, $n=1000$ over $N=1000$ repetitions. Left: Density plot (black curve) and normal density with estimated parameters (red curve). Right: QQ-Plot against standard normal, and qqline.

Theorems & Definitions (28)

  • Theorem 1
  • Remark 1: Choice of $L$ and $K$
  • Remark 2: Higher moments of the volume of $C(J)$
  • Theorem 2
  • Remark 3: Matching estimators
  • Theorem 3
  • Remark 4: Continuous design density and regression functions
  • Remark 5: Fixed design
  • Theorem 4
  • Lemma 6.1
  • ...and 18 more