Inference for Projection Parameters in Linear Regression: beyond $d = o(n^{1/2})$

Woonyoung Chang; Arun Kumar Kuchibhotla; Alessandro Rinaldo

TL;DR

This work tackles inference for projection parameters in high-dimensional, possibly misspecified linear regression by developing a bias-corrected least squares estimator that remains $\sqrt{n}$-consistent when $d=o(n^{2/3})$ (up to polylog factors). It provides explicit finite-sample Berry–Esseen bounds for both unnormalized and studentized linear contrasts, enabling accurate distributional approximations without relying on variance estimation. The authors introduce three inference methods (HulC, $t$-statistic based, and bootstrap) that yield valid confidence regions under minimal moment assumptions and without requiring $d$ to scale as $o(n^{1/2})$. They also establish consistency results for the sandwich variance estimator and discuss extensions to push the dimension range further via higher-order $U$-statistics. The numerical studies illustrate the practical performance of the proposed approaches in both well-specified and misspecified settings, highlighting robustness and competitive coverage across a range of $(n,d)$ pairs.

Abstract

We consider the problem of inference for projection parameters in linear regression with increasing dimensions. This problem has been studied under a variety of assumptions in the literature. The classical asymptotic normality result for the least squares estimator of the projection parameter only holds when the dimension $d$ of the covariates is of a smaller order than $n^{1/2}$, where $n$ is the sample size. Traditional sandwich estimator-based Wald intervals are asymptotically valid in this regime. In this work, we propose a bias correction for the least squares estimator and prove the asymptotic normality of the resulting debiased estimator. Precisely, we provide an explicit finite-sample Berry–Esseen bound on the Normal approximation to the law of the linear contrasts of the proposed estimator normalized by the sandwich standard error estimate. Our bound, under only finite moment conditions on covariates and errors, tends to 0 as long as $d = o(n^{2/3})$ up to polylogarithmic factors. Furthermore, we leverage recent methods of statistical inference that do not require an estimator of the variance to perform asymptotically valid statistical inference and that lead to sharper miscoverage control than Wald intervals. We provide a discussion of how our techniques can be generalized to increase the allowable range of $d$ even further.
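
To make the baseline mentioned in the abstract concrete, here is a minimal sketch of a classical sandwich-based Wald interval for a linear contrast $\theta^\top\beta$ of the ordinary least squares estimator. It is not the authors' bias-corrected procedure. The function name `sandwich_wald_ci`, its arguments, and the hard-coded 0.975 Normal quantile are assumptions made for illustration only.

```python
import numpy as np


def sandwich_wald_ci(X, y, theta, z=1.959964):
    """Classical OLS Wald interval for theta' beta with a sandwich variance.

    Illustrative baseline only (hypothetical helper, not the paper's
    debiased estimator). `z` is the 0.975 standard Normal quantile,
    giving a nominal 95% interval.
    """
    n = X.shape[0]
    gram = X.T @ X / n                          # sample Gram matrix
    beta_hat = np.linalg.solve(gram, X.T @ y / n)
    resid = y - X @ beta_hat
    # "Meat" of the sandwich: (1/n) * sum_i resid_i^2 * x_i x_i'
    meat = (X * resid[:, None] ** 2).T @ X / n
    # u = gram^{-1} theta, so u' meat u = theta' gram^{-1} meat gram^{-1} theta
    u = np.linalg.solve(gram, theta)
    se = np.sqrt(u @ meat @ u / n)
    est = theta @ beta_hat
    return est, est - z * se, est + z * se
```

Calling `sandwich_wald_ci(X, y, theta)` with an `(n, d)` design matrix, a length-`n` response, and a length-`d` contrast vector returns the point estimate together with the lower and upper endpoints. Per the abstract, intervals of this kind are asymptotically valid only when $d$ is of smaller order than $n^{1/2}$.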

Paper Structure (56 sections, 31 theorems, 266 equations, 19 figures, 1 table, 1 algorithm)

Introduction
Related Works
Asymptotic Normality and Berry Esseen bound for the Least Square Estimator
De-biasing of the Least Square Estimator
Variance Estimation
Contributions
Organization
Problem setup and Assumptions
Projection parameters
Additional Notations
Assumptions
Methods and Results
Approximation of the Sample Gram Matrix and Bias Characterization
Bias Estimation
Berry Esseen Bound for the Unnormalized Bias-corrected Estimator
...and 41 more sections

Key Result

Theorem 1

Suppose that Assumption asmp:2 holds for $q>3$ and that Assumption asmp:4 holds.

Figures (19)

Figure 1: Comparison of coverages and widths of confidence intervals: wild bootstrap based on jackknife estimators, wild bootstrap based on proposed bias-corrected estimators, HulC using proposed bias-corrected estimators, $t$-statistic based inference using proposed bias-corrected estimators, and Wald confidence interval based on proposed estimators. The empirical coverages and widths of CI are computed based on 1000 replications. To attain a 0.95-level HulC confidence interval, Algorithm 1 requires data to be split into at most six subsets. This only allows the dimension to increase by $\approx 320$ in order to ensure that the least squares estimator is well-defined within a subset of data. Furthermore, $t$-statistic based inference also requires data splitting, and we used the same data split as employed in the HulC CI for the sake of simplicity.
Figure 2: Comparison of coverages and widths of confidence intervals obtained from 5 different methods under the misspecified model with $n=20000$, $\theta=\mathbf e_1$. The model parameter $\rho$ is set to $0.0$ (top) and $0.5$ (bottom). See Figure 1 for the abbreviations of methods. The empirical coverages and widths of CI are computed based on 1000 replications. It is noteworthy that $n^{1/2}\approx 141$ and $n^{2/3}\approx 737$.
Figure 3: Empirical coverages and widths of 95% confidence intervals under a well-specified case with $n=1000$. Top row: Comparison of coverages and widths of confidence intervals; wild bootstrap based on Jackknife estimators, wild bootstrap based on proposed bias-corrected estimators, HulC using proposed bias-corrected estimators, and $t$-statistic based inference using proposed bias-corrected estimators, and Wald confidence interval based on proposed estimators. Bottom row: Comparison of coverages and widths of confidence intervals obtained from five different bootstrap methods; wild bootstrap CI based on Jackknife estimators, wild bootstrap CI based on proposed estimators, resample bootstrap CI based on proposed estimators, wild bootstrap CI based on OLS, resample bootstrap CI based on OLS. The empirical coverages and widths of CI are computed based on 1000 replications.
Figure 4: Empirical coverages and widths of proposed confidence intervals under a well-specified case with $n=2000$. See Figure \ref{fig:D.1.1} for the abbreviations of inferential methods.
Figure 5: Comparison of coverages and widths of confidence intervals for $\theta^\top\beta$ under the misspecified model with $n=1000$, $\theta=\mathbf e_1$, and $\rho=0.0$.
...and 14 more figures
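
The Figure 1 caption notes that a 0.95-level HulC interval needs at most six data splits. The sketch below illustrates where that count comes from, assuming the standard HulC construction for an estimator with zero median bias: the convex hull of $B$ independent batch estimates misses the target with probability at most $2^{1-B}$. The helper names `hulc_num_batches` and `hulc_interval` are hypothetical, and the randomization between $B-1$ and $B$ batches used by the full method is omitted here.

```python
import math

import numpy as np


def hulc_num_batches(alpha):
    """Smallest B with 2^(1 - B) <= alpha (assumes zero median bias)."""
    return math.ceil(math.log2(2.0 / alpha))


def hulc_interval(data, estimator, alpha=0.05, seed=0):
    """Convex hull of batch estimates: a simplified HulC-style interval.

    `estimator` maps a batch (rows of `data`) to a scalar estimate.
    This simplified sketch skips the randomized choice of batch count.
    """
    num_batches = hulc_num_batches(alpha)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(len(data))
    batches = np.array_split(shuffled, num_batches)
    estimates = [float(estimator(data[idx])) for idx in batches]
    return min(estimates), max(estimates)
```

For `alpha = 0.05`, `hulc_num_batches` returns 6, since $2^{-5} \approx 0.031 \le 0.05$ while $2^{-4} = 0.0625 > 0.05$. Because each batch holds only about a sixth of the sample, the least squares estimator must be computable on that smaller subset, which is why the caption reports a cap on the dimension.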

Theorems & Definitions (60)

Theorem 1
Remark 1
Remark 2
Remark 3
Theorem 2
Theorem 3
Corollary 1
Theorem 4
Remark 4
Theorem 5
...and 50 more
