Generalization for Least Squares Regression With Simple Spiked Covariances

Jiping Li; Rishi Sonthalia

TL;DR

This paper examines two simple models exhibiting spiked covariances and derives their generalization error in the asymptotic proportional regime, demonstrating that the eigenvector and eigenvalue corresponding to the spike significantly influence the generalization error.

Abstract

Random matrix theory has proven to be a valuable tool in analyzing the generalization of linear models. However, the generalization properties of even two-layer neural networks trained by gradient descent remain poorly understood. To understand the generalization performance of such networks, it is crucial to characterize the spectrum of the feature matrix at the hidden layer. Recent work has made progress in this direction by describing the spectrum after a single gradient step, revealing a spiked covariance structure. Yet, the generalization error for linear models with spiked covariances has not been previously determined. This paper addresses this gap by examining two simple models exhibiting spiked covariances. We derive their generalization error in the asymptotic proportional regime. Our analysis demonstrates that the eigenvector and eigenvalue corresponding to the spike significantly influence the generalization error.
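To make the setting concrete, the following is a minimal simulation sketch of least squares under a spiked covariance. It is not the paper's model or derivation: the covariance form Sigma = I + theta * u u^T, the function name `spiked_risk`, the parameter `align` (the overlap between the target vector and the spike direction), and all default values are assumptions chosen for illustration. It estimates the excess risk of the minimum-norm least squares solution empirically rather than through the asymptotic formulas.

```python
import numpy as np

def spiked_risk(n, d, theta, align, noise_std=0.5, trials=20, seed=0):
    """Empirical excess risk of min-norm least squares (illustrative only).

    Assumed setup, not taken from the paper:
      rows of X ~ N(0, Sigma) with Sigma = I + theta * u u^T,
      y = X beta + noise, beta unit-norm with <beta, u> = align.
    """
    rng = np.random.default_rng(seed)
    u = np.zeros(d); u[0] = 1.0   # spike eigenvector
    v = np.zeros(d); v[1] = 1.0   # a direction orthogonal to the spike
    beta = align * u + np.sqrt(1.0 - align**2) * v

    sigma = np.eye(d) + theta * np.outer(u, u)
    # Sigma^{1/2} = I + (sqrt(1 + theta) - 1) u u^T
    scale = np.sqrt(1.0 + theta) - 1.0

    risks = []
    for _ in range(trials):
        z = rng.standard_normal((n, d))
        x = z + scale * np.outer(z @ u, u)        # rows have covariance Sigma
        y = x @ beta + noise_std * rng.standard_normal(n)
        beta_hat = np.linalg.pinv(x) @ y          # min-norm least squares
        diff = beta_hat - beta
        risks.append(diff @ sigma @ diff)         # excess risk on fresh data
    return float(np.mean(risks))
```

Sweeping `theta` and `align` at a fixed ratio d/n, for example `spiked_risk(200, 100, theta, align)`, gives a rough empirical view of how the spike's eigenvalue and its alignment with the target affect the error in this toy setup.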
