Ridge interpolators in correlated factor regression models -- exact risk analysis

Mihailo Stojnic


TL;DR

This work analyzes correlated factor regression models (FRMs) and evaluates ridge interpolators using Random Duality Theory (RDT) to obtain exact, closed-form excess risk characterizations for the GLS, ridge, and LS estimators. It confirms that the GLS risk exhibits non-monotonic double-descent behavior as a function of the over-parametrization ratio and shows that optimally tuned ridge regularization can smooth this effect, although the smoothing is of limited effect for ratios above $5$ and of virtually no effect for those above $10$. The author also establishes a precise FRM–LRM connection and provides numerical results that corroborate the theory, including uncorrelated scenarios and high-SNR regimes. Overall, the results offer rigorous insights into interpolation phenomena in FRMs and support the broader relevance of zero-training generalization in structured linear models with correlations.

Abstract

We consider correlated \emph{factor} regression models (FRM) and analyze the performance of classical ridge interpolators. Utilizing the powerful \emph{Random Duality Theory} (RDT) mathematical engine, we obtain \emph{precise} closed-form characterizations of the underlying optimization problems and all associated optimizing quantities. In particular, we provide \emph{excess prediction risk} characterizations that clearly show the dependence on all key model parameters, covariance matrices, loadings, and dimensions. As a function of the over-parametrization ratio, the generalized least squares (GLS) risk also exhibits the well-known \emph{double-descent} (non-monotonic) behavior. Similarly to the classical linear regression models (LRM), we demonstrate that such an FRM phenomenon can be smoothed out by optimally tuned ridge regularization. The theoretical results are supplemented by numerical simulations, and an excellent agreement between the two is observed. Moreover, we note that ``ridge smoothening'' is often of limited effect already for over-parametrization ratios above $5$ and of virtually no effect for those above $10$. This solidifies the notion that one of the recently most popular neural network paradigms -- \emph{zero-training (interpolating) generalizes well} -- enjoys wider applicability, including the one within the FRM estimation/prediction context.
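To make the setting in the abstract concrete, the following is a minimal simulation sketch, not the paper's method or experimental setup. It assumes a generic latent-factor model in which the observed features are noisy linear images of a few hidden factors and the response depends on the factors only. The function names (`make_model`, `sample`, `ridge_fit`, `prediction_error`), the Gaussian loadings, and all parameter values are illustrative assumptions. GLS is not implemented; `lam = 0` gives the plain minimum-norm least squares solution, which interpolates the training data when the number of features exceeds the number of samples.

```python
import numpy as np


def make_model(n, k, rng):
    """Hypothetical loadings W (n x k) and factor coefficients beta (k,)."""
    W = rng.standard_normal((n, k)) / np.sqrt(k)
    beta = rng.standard_normal(k) / np.sqrt(k)
    return W, beta


def sample(m, W, beta, sigma_x, sigma_y, rng):
    """Draw m samples: features X = Z W^T + noise, responses y = Z beta + noise."""
    n, k = W.shape
    Z = rng.standard_normal((m, k))                      # latent factors
    X = Z @ W.T + sigma_x * rng.standard_normal((m, n))  # observed features
    y = Z @ beta + sigma_y * rng.standard_normal(m)      # responses
    return X, y


def ridge_fit(X, y, lam):
    """Minimize ||y - X theta||^2 + lam * ||theta||^2.

    lam == 0 returns the minimum-norm least squares solution via the
    pseudo-inverse (an interpolator when X has full row rank and n > m).
    """
    m = X.shape[0]
    if lam == 0.0:
        return np.linalg.pinv(X) @ y
    # Dual (m x m) form of the ridge solution.
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(m), y)


def prediction_error(theta, W, beta, sigma_x, sigma_y, rng, m_test=2000):
    """Mean squared prediction error on fresh samples from the same model."""
    X, y = sample(m_test, W, beta, sigma_x, sigma_y, rng)
    return float(np.mean((X @ theta - y) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, k, sigma_x, sigma_y = 50, 5, 0.5, 0.5
    for n in (25, 50, 100, 250, 500):       # over-parametrization ratio n / m
        W, beta = make_model(n, k, rng)
        X, y = sample(m, W, beta, sigma_x, sigma_y, rng)
        for lam in (0.0, 1.0):
            theta = ridge_fit(X, y, lam)
            err = prediction_error(theta, W, beta, sigma_x, sigma_y, rng)
            print(f"n/m = {n / m:5.1f}  lam = {lam:3.1f}  test MSE = {err:.3f}")
```

Sweeping `n / m` and `lam` in this sketch gives a rough empirical view of how the interpolating and ridge-regularized solutions compare as the ratio grows. The quantity printed is a raw test mean squared error, which may differ from the excess risk definition used in the paper.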


Paper Structure

This paper contains 25 sections, 11 theorems, 204 equations, 2 figures, and 1 table.

Introduction
Correlated factor regression models --- mathematical setup
FRM versus LRM estimation
Key problem infrastructure features
Linearly related dimensions
Underlying statistics
Classical linear estimators
Related literature and our contributions
Precise excess risk analysis
Excess risk of the GLS interpolator
GLS excess risk via RDT
Compact risk form -- GLS
Excess risk of the ridge estimator
Ridge estimator excess risk via RDT
Compact risk form -- Ridge estimator
...and 10 more sections

Key Result

Lemma 1

(Algebraic optimization representation) Let $V\in{\mathbb R}^{n\times n}$, $\overline{U}\in{\mathbb R}^{m\times m}$, and $\overline{\overline{V}}\in{\mathbb R}^{n\times n}$ be three given unitary (orthogonal) matrices and let $\Sigma\in{\mathbb R}^{n\times n}$, $\overline{\Sigma}\in{\mathbb R}^{m\ti Then

Figures (2)

Figure 1: Excess risk -- low SNR; Covariance matrices are: $A={\mathcal{A}}(q)$, and $\overline{\overline{A}}={\mathcal{A}}(q_e)$; $q=0.5,q_e=0.3$.
Figure 2: Excess risk; uncorrelated factors and noise and scaled unitary loadings

Theorems & Definitions (22)

Lemma 1
proof
Theorem 1
proof
Lemma 2
proof
Theorem 2
proof
Corollary 1
proof
...and 12 more
