
Subscedastic weighted least squares estimates

Jordan Bryan, Haibo Zhou, Didong Li

TL;DR

This paper characterizes when feasible weighted least squares (FLS) with fixed or random weights can outperform ordinary least squares (OLS) in a heteroscedastic linear regression. It introduces the subscedastic set $\mathcal{C}^p_{\boldsymbol \Omega}$, defined via a determinant-based inequality on the covariance function $H_{\mathbf X}$, and derives a sharp pairwise-ratio condition $1 \le \tilde{\omega}_i/\tilde{\omega}_j \le 2(\omega_i/\omega_j) - 1$ that characterizes membership in $\mathcal{C}^p_{\boldsymbol \Omega}$ (extending from $p=1$ to general $p$). The authors show that fixed-weight FLS can outperform OLS under these subscedastic criteria and connect this to the behavior of certain robust estimators; they derive asymptotic covariance forms for the $t$-based estimator $\hat{\boldsymbol \beta}_T$ and the Huber estimator, establishing upper bounds on their variances relative to OLS. Through Monte Carlo and real-data analyses, they demonstrate that $t$-derived estimates often achieve substantial efficiency gains when heteroscedasticity is moderate to large, and provide practical inference via empirical covariance-based confidence intervals. Overall, the work offers a principled framework for designing feasible weights and links robust regression methods to subscedastic efficiency, with implications for inference under heteroscedasticity.
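The pairwise-ratio condition quoted above can be checked mechanically for a given set of fixed weights. The sketch below is a minimal illustration, not the authors' code: the function name `satisfies_pairwise_ratio` is hypothetical, and it assumes the inequality is meant to hold for every pair ordered so that $\omega_i \ge \omega_j$.

```python
import numpy as np

def satisfies_pairwise_ratio(omega, omega_tilde, tol=1e-12):
    """Check 1 <= wt_i/wt_j <= 2*(w_i/w_j) - 1 for all pairs with w_i >= w_j.

    Assumption: the condition is applied to pairs ordered so that
    omega_i >= omega_j; the name and tolerance are illustrative only.
    """
    omega = np.asarray(omega, dtype=float)
    omega_tilde = np.asarray(omega_tilde, dtype=float)
    r = omega[:, None] / omega[None, :]              # r[i, j] = w_i / w_j
    s = omega_tilde[:, None] / omega_tilde[None, :]  # s[i, j] = wt_i / wt_j
    ordered = r >= 1.0                               # keep pairs with w_i >= w_j
    ok = (s >= 1.0 - tol) & (s <= 2.0 * r - 1.0 + tol)
    return bool(np.all(ok[ordered]))

omega = np.array([1.0, 2.0, 4.0])
check_ols = satisfies_pairwise_ratio(omega, np.ones(3))       # constant weights
check_sqrt = satisfies_pairwise_ratio(omega, np.sqrt(omega))  # square-root weights
check_sq = satisfies_pairwise_ratio(omega, omega**2)          # squared weights
check_rev = satisfies_pairwise_ratio(omega, 1.0 / omega)      # wrong ordering
```

With these inputs, constant weights (OLS) and square-root weights pass, while squared weights fail the upper bound (for $\omega_i/\omega_j = 2$ the weight ratio is 4, above $2 \cdot 2 - 1 = 3$) and reversed weights fail the lower bound.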

Abstract

In the heteroscedastic linear model, the weighted least squares (WLS) estimate of the model coefficients is more efficient than the ordinary least squares (OLS) estimate. However, the practical application of WLS is challenging because it requires knowledge of the error variances. Feasible weighted least squares (FLS) estimates, which use approximations of the variances when they are unknown, may either be more or less efficient than the OLS estimate depending on the quality of the approximation. A direct comparison between FLS and OLS has significant implications for the application of regression analysis in varied fields, yet such a comparison remains an unresolved challenge. In this study, we address this challenge by identifying the conditions under which FLS estimates using fixed weights demonstrate greater efficiency than the OLS estimate. These conditions provide guidance for the design of feasible estimates using random weights. They also shed light on how certain robust regression estimates behave with respect to the linear model with normal errors of unequal variance.
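
The comparison in the abstract can be written out with standard formulas. The notation here is an assumption, since the abstract does not fix it: take $\mathbf y = \mathbf X \boldsymbol\beta + \boldsymbol\epsilon$ with $\boldsymbol\epsilon \sim N(\mathbf 0, \boldsymbol\Omega)$ and $\boldsymbol\Omega = \mathrm{diag}(\omega_1, \dots, \omega_n)$ holding the error variances, and let $\tilde{\boldsymbol\Omega}$ be a fixed positive diagonal matrix of approximate variances. The fixed-weight estimate and its covariance are then

$$
\hat{\boldsymbol\beta}(\tilde{\boldsymbol\Omega}) = (\mathbf X^\top \tilde{\boldsymbol\Omega}^{-1} \mathbf X)^{-1} \mathbf X^\top \tilde{\boldsymbol\Omega}^{-1} \mathbf y,
\qquad
\mathrm{Cov}\{\hat{\boldsymbol\beta}(\tilde{\boldsymbol\Omega})\} = (\mathbf X^\top \tilde{\boldsymbol\Omega}^{-1} \mathbf X)^{-1} \mathbf X^\top \tilde{\boldsymbol\Omega}^{-1} \boldsymbol\Omega \tilde{\boldsymbol\Omega}^{-1} \mathbf X (\mathbf X^\top \tilde{\boldsymbol\Omega}^{-1} \mathbf X)^{-1}.
$$

Setting $\tilde{\boldsymbol\Omega} = \mathbf I$ gives OLS, with covariance $(\mathbf X^\top \mathbf X)^{-1} \mathbf X^\top \boldsymbol\Omega \mathbf X (\mathbf X^\top \mathbf X)^{-1}$, and setting $\tilde{\boldsymbol\Omega} = \boldsymbol\Omega$ gives WLS, whose covariance collapses to $(\mathbf X^\top \boldsymbol\Omega^{-1} \mathbf X)^{-1}$. An FLS estimate with fixed weights sits between these two choices, and whether it improves on OLS depends on how $\tilde{\boldsymbol\Omega}$ relates to $\boldsymbol\Omega$.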

Paper Structure

This paper contains 25 sections, 17 theorems, 217 equations, 7 figures, and 2 tables.

Key Result

Proposition 1

Let $g : \mathbb{R}_+ \rightarrow \mathbb{R}_{+}$ be a function such that for each $\omega > 0$ Then if $\tilde{\omega}_i = g(\omega_i)$ for $i \in \{1, \dots, n\}$, $\tilde{\boldsymbol \Omega} \in \mathcal{C}^1_{\boldsymbol \Omega}$.
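
For a concrete feel of the $p = 1$ case, the following sketch compares exact variances of the OLS, fixed-weight FLS, and WLS estimates for a single regressor. It is an illustration under assumptions: the helper `fixed_weight_variance` and the data are hypothetical, and $g(\omega) = \omega^{1/2}$ is chosen as one member of the $\omega^{1/q}$ family listed under Figure 1 (with $q = 2$), not as the proposition's general condition. For this particular $g$, the ordering FLS $\le$ OLS also follows directly from Hölder's inequality.

```python
import numpy as np

def fixed_weight_variance(x, omega, omega_tilde):
    """Exact variance of the p = 1 estimate sum(x*y/wt) / sum(x**2/wt).

    omega holds the true error variances, omega_tilde the fixed
    approximate variances used as weights (illustrative helper).
    """
    x = np.asarray(x, dtype=float)
    omega = np.asarray(omega, dtype=float)
    omega_tilde = np.asarray(omega_tilde, dtype=float)
    num = np.sum(x**2 * omega / omega_tilde**2)
    den = np.sum(x**2 / omega_tilde) ** 2
    return num / den

# Hypothetical design and error variances, chosen only for illustration.
x = np.array([1.0, -2.0, 0.5, 3.0])
omega = np.array([0.5, 1.0, 4.0, 9.0])

g = np.sqrt  # assumed choice g(w) = w**(1/2)

var_ols = fixed_weight_variance(x, omega, np.ones_like(omega))  # weights ignore omega
var_fls = fixed_weight_variance(x, omega, g(omega))             # weights from g(omega)
var_wls = fixed_weight_variance(x, omega, omega)                # weights use true omega

print(f"OLS: {var_ols:.4f}  FLS: {var_fls:.4f}  WLS: {var_wls:.4f}")
```

In this example the FLS variance falls between the WLS and OLS variances, which is the kind of ordering that membership in $\mathcal{C}^1_{\boldsymbol \Omega}$ is meant to guarantee.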

Figures (7)

  • Figure 1: Three functions satisfying \ref{eq:grm}. From left to right, the functions are $\omega^{1/q}$, $\sqrt{\omega}/\{\int_{-k}^k \exp(-z^2 / 2\omega) dz \}$, and $g_{1, \nu}(\omega)$, where this last function is defined in Theorem \ref{thm:asymvar}. For the purposes of visualization, the functions have been normalized to attain a maximum value of 1 at $\omega = 10$.
  • Figure 2: Theoretical behavior of $t$-derived estimates. On the left, a comparison of the standardized generalized variance of three estimates relative to that of the WLS estimate for small degrees of freedom. On the right, the same comparison is made for large degrees of freedom.
  • Figure 3: Results of the first simulation study using the longnecker_association_2001 dataset, which uses a sample of inverse gamma distributed random variables to specify the heteroscedasticity.
  • Figure 4: Results of the second simulation study using the longnecker_association_2001 dataset, which uses a parametric model to specify the heteroscedasticity.
  • Figure 5: Results of the third simulation study using the longnecker_association_2001 dataset, which simulates from a modified version of the mixed effects model in \ref{eq:mixmod}.
  • ...and 2 more figures

Theorems & Definitions (34)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Theorem 1
  • Corollary 1
  • Corollary 2
  • Corollary 3
  • Proposition 4
  • Theorem 2
  • Corollary 4
  • ...and 24 more