Equality between two general ridge estimators and equivalence of their residual sums of squares
Hirai Mukasa, Koji Tsukuda
TL;DR
This work addresses when two general ridge estimators in the linear model $\boldsymbol{y} = \boldsymbol{X} \boldsymbol{\beta} + \boldsymbol{\varepsilon}$ are identical and when their residual sums of squares (RSS) agree under a known covariance $\boldsymbol{\Omega}$. It develops a set of necessary and sufficient conditions, including a main result that $\hat{\boldsymbol{\beta}}(\boldsymbol{\Omega}, \boldsymbol{K}_1) = \hat{\boldsymbol{\beta}}(\boldsymbol{I}_n, \boldsymbol{K}_2)$ for all $\boldsymbol{y}$ if and only if there exists $\boldsymbol{G}$ with $\boldsymbol{X} = \boldsymbol{\Omega} \boldsymbol{X} \boldsymbol{G}$ and $\boldsymbol{K}_1 = \boldsymbol{K}_2 \boldsymbol{G}$; in the positive definite (PD) case this reduces to $\boldsymbol{X} = \boldsymbol{\Omega} \boldsymbol{X} \boldsymbol{K}_2^{-1} \boldsymbol{K}_1$. A second main result characterizes the RSS equality through three explicit conditions, including a bias-consistency relation and the requirement $\boldsymbol{\Delta} = (\boldsymbol{Z}^T \boldsymbol{Z})^{-1}$, with $\boldsymbol{A}=(\boldsymbol{\Gamma}-\boldsymbol{\Xi}\boldsymbol{\Delta}^{-1}\boldsymbol{\Xi}^T)^{-1}$, where $\boldsymbol{Z}$, $\boldsymbol{\Delta}$, $\boldsymbol{\Gamma}$, and $\boldsymbol{\Xi}$ are as defined in the paper. The paper also provides corollaries and concrete examples linking estimator equivalence to covariance structure and bias, clarifying when ridge-based inferences are invariant to the choice of regularization or error-structure specification.
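The first main result can be checked numerically. The sketch below is illustrative only and rests on an assumption not stated in this summary: that the general ridge estimator has the usual form $\hat{\boldsymbol{\beta}}(\boldsymbol{\Omega}, \boldsymbol{K}) = (\boldsymbol{X}^T \boldsymbol{\Omega}^{-1} \boldsymbol{X} + \boldsymbol{K})^{-1} \boldsymbol{X}^T \boldsymbol{\Omega}^{-1} \boldsymbol{y}$. The helper name `general_ridge` and the particular construction of $\boldsymbol{X}$, $\boldsymbol{\Omega}$, $\boldsymbol{K}_1$, $\boldsymbol{K}_2$, and $\boldsymbol{G}$ are hypothetical choices made so that $\boldsymbol{X} = \boldsymbol{\Omega} \boldsymbol{X} \boldsymbol{G}$ and $\boldsymbol{K}_1 = \boldsymbol{K}_2 \boldsymbol{G}$ hold by design; they are not taken from the paper.

```python
import numpy as np


def general_ridge(X, y, Omega, K):
    """General ridge estimator under the ASSUMED form
    (X' Omega^{-1} X + K)^{-1} X' Omega^{-1} y, with Omega symmetric PD."""
    Oinv_X = np.linalg.solve(Omega, X)  # Omega^{-1} X
    # Oinv_X.T equals X' Omega^{-1} because Omega is symmetric.
    return np.linalg.solve(X.T @ Oinv_X + K, Oinv_X.T @ y)


rng = np.random.default_rng(0)
n, p = 6, 3

# Orthogonal Q; the column space of X is spanned by the first p columns of Q.
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
R = np.eye(p) + np.triu(rng.normal(size=(p, p)), k=1)  # unit upper triangular, invertible
X = Q[:, :p] @ R

# Omega has the columns of Q as eigenvectors, so Omega X = Q[:, :p] D1 R.
d = rng.uniform(0.5, 2.0, size=n)
Omega = Q @ np.diag(d) @ Q.T

# Hypothetical choice: G = R^{-1} D1^{-1} R gives X = Omega X G.
G = np.linalg.solve(R, np.diag(1.0 / d[:p]) @ R)

# K2 = R' C R is PD, and K1 = K2 G = R' C D1^{-1} R is symmetric PD as well.
c = rng.uniform(0.5, 2.0, size=p)
K2 = R.T @ np.diag(c) @ R
K1 = K2 @ G

y = rng.normal(size=n)
beta_omega = general_ridge(X, y, Omega, K1)     # uses Omega and K1
beta_ident = general_ridge(X, y, np.eye(n), K2)  # uses I_n and K2

print("X = Omega X G holds:", np.allclose(X, Omega @ X @ G))
print("estimators coincide:", np.allclose(beta_omega, beta_ident))
```

Because both estimators are linear in $\boldsymbol{y}$, agreement for one random $\boldsymbol{y}$ is only a spot check; comparing the two $p \times n$ coefficient matrices directly would verify the "for all $\boldsymbol{y}$" statement for this particular construction.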
Abstract
General ridge estimators are typical linear estimators in a general linear model. This class includes some shrinkage estimators in addition to classical linear unbiased estimators such as the ordinary least squares estimator and the weighted least squares estimator. We derive necessary and sufficient conditions under which two general ridge estimators coincide. In particular, we add two noteworthy conditions to those from previous studies. The first takes the form of what appears to be a column space relationship involving the covariance matrix of the error term, and the second is based on the biases of general ridge estimators. We also derive a necessary and sufficient condition under which the residual sums of squares of two general ridge estimators are equal.
