Algebraic and Statistical Properties of the Partially Regularized Ordinary Least Squares Interpolator

Letian Yang, Dennis Shen

Abstract

Modern deep learning has revealed a surprising statistical phenomenon known as benign overfitting, with high-dimensional linear regression being a prominent example. This paper contributes to ongoing research on the ordinary least squares (OLS) interpolator, focusing on the partial regression setting, where only a subset of coefficients is implicitly regularized. On the algebraic front, we extend Cochran's formula and the leave-one-out residual formula to the partial regularization framework. On the stochastic front, we leverage our algebraic results to design several homoskedastic variance estimators under the Gauss-Markov model. These estimators serve as a basis for conducting statistical inference, albeit with slight conservatism. Through simulations, we study the finite-sample properties of these variance estimators across various generative models.
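To make the idea of partial implicit regularization concrete, the following is a minimal sketch, not the authors' implementation. It assumes the partially regularized interpolator solves $\min \|\boldsymbol{\gamma}\|_2$ subject to $\boldsymbol{Z}\boldsymbol{\alpha} + \boldsymbol{W}\boldsymbol{\gamma} = \boldsymbol{y}$, where $\boldsymbol{Z}$ collects the unpenalized columns (here an intercept and a binary treatment) and $\boldsymbol{W}$ has full row rank with $p \ge n$. The function names, the simulated data, and the coefficient values are all hypothetical choices made for illustration.

```python
import numpy as np


def partial_min_norm_interpolator(Z, W, y):
    """Sketch: minimize ||gamma||_2 subject to Z @ alpha + W @ gamma = y.

    Z: (n, k) columns left unpenalized (assumed full column rank).
    W: (n, p) penalized covariates, p >= n, assumed full row rank.
    """
    K = W @ W.T                                  # (n, n) Gram matrix, invertible by assumption
    Kinv_Z = np.linalg.solve(K, Z)
    Kinv_y = np.linalg.solve(K, y)
    # Stationarity of the Lagrangian gives a GLS-type system for alpha.
    alpha = np.linalg.solve(Z.T @ Kinv_Z, Z.T @ Kinv_y)
    # gamma is the minimum-norm solution of W @ gamma = y - Z @ alpha.
    gamma = W.T @ np.linalg.solve(K, y - Z @ alpha)
    return alpha, gamma


def full_min_norm_interpolator(X, y):
    """Fully regularized counterpart: minimum-norm solution over all coefficients."""
    return np.linalg.pinv(X) @ y


# Hypothetical data, loosely matching the (n, p) = (80, 100) setting of Figure 1.
rng = np.random.default_rng(0)
n, p = 80, 100
W = rng.standard_normal((n, p))                  # assumption: i.i.d. Gaussian covariates
D = rng.binomial(1, 0.5, size=n).astype(float)   # Bernoulli(0.5) treatment indicator
Z = np.column_stack([np.ones(n), D])             # unpenalized: intercept and treatment
y = 1.0 + 2.0 * D + W @ (rng.standard_normal(p) / np.sqrt(p)) + rng.standard_normal(n)

alpha, gamma = partial_min_norm_interpolator(Z, W, y)
residual = y - Z @ alpha - W @ gamma             # should vanish: the fit interpolates

beta_full = full_min_norm_interpolator(np.column_stack([Z, W]), y)
gamma_full = beta_full[Z.shape[1]:]              # penalized block of the full interpolator
```

Both fits interpolate the data, but only the partial version leaves the intercept and treatment coefficients free of the norm penalty, so its `gamma` has norm no larger than the corresponding block `gamma_full` of the fully regularized solution.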

Paper Structure

This paper contains 58 sections, 9 theorems, 57 equations, 9 figures.

Key Result

Theorem 1

If Assumption assump:partial_3 holds, then for any tuples $(\widehat{\boldsymbol{\alpha}}, \widehat{\boldsymbol{\gamma}}, \widehat{\boldsymbol{\tau}}) \in \mathcal{S}_1$, $(\widetilde{\boldsymbol{\alpha}}, \widetilde{\boldsymbol{\tau}}) \in \mathcal{S}_2$, and $(\widehat{\boldsymbol{\Delta}}, \ldots)$ … If the tuples $(\widehat{\boldsymbol{\alpha}}, \widehat{\boldsymbol{\gamma}}, \ldots)$ …

Figures (9)

  • Figure 1: Biases in estimating the ATE for fully (red) and partially (green) regularized OLS interpolators. Here, $(n,p)=(80,100)$, $\tau = [0, \pm 1, \pm 2, \pm 4, \pm 6, \pm 8]$, $\boldsymbol{W}$ is generated from a spiked covariance model, $\boldsymbol{D} \sim \text{Bernoulli}(0.5)^n$, and standard normal noise is added. See supplementary materials for details.
  • Figure 2: Simulation results with fixed $p = 100$ and varying $n \in \{20,40,60,80,99\}$. The solid lines represent the average bias over 100 trials, with shading indicating $\pm$ one standard error.
  • Figure 3: Simulation results with fixed ratio $n/p=0.8$ and varying $p\in\{50,75,100,125,150\}$. The solid lines represent the average bias, with shading indicating $\pm$ one standard error.
  • Figure 4: Simulation results with fixed $p=100$, $n=80$, and varying $\sigma \in \{1, 2, 5, 7, 10 \}$. The solid lines represent the average bias, with shading indicating $\pm$ one standard error.
  • Figure 5: Simulation results with fixed $p=100$, $n=80$, and varying intercept magnitude $\beta_0 \in \{1, 2, 5, 7, 10 \}$. The solid lines represent the average bias, with shading indicating $\pm$ one standard error.
  • ...and 4 more figures

Theorems & Definitions (17)

  • Remark 1: Alternative expression
  • Theorem 1: Cochran's formula
  • Corollary 1
  • Proposition 1
  • Corollary 2
  • Theorem 2
  • Theorem 3
  • Remark 2: Omitting $\widehat{\sigma}_{\mathcal{W}}^2$
  • Remark 3: General takeaways
  • Definition 1
  • ...and 7 more