A Generalized Spectral Framework to Explain Neural Scaling and Compression Dynamics
Yizhou Zhang
TL;DR
This work introduces a generalized spectral framework that unifies neural learning dynamics and compression phenomena through a flexible evolution function g(λ,t;β) characterized by an elasticity ρ(β). By linking the learning frontier, loss decay, and compression robustness within a single template, it recovers the kernel/NTK and feature-learning limits as special cases and derives a universal complementarity between learning and compression. The theory recasts pruning as spectral truncation and quantization as spectral perturbation, and it predicts consistent loss- and density-based scaling together with a Densing Law in which effective spectral density grows with compute. The framework offers concrete predictions for how model density, loss decay, and compression sensitivity evolve with compute, and it outlines key open questions for extending the theory to multi-modal spectra and to inverse identification from data.
Abstract
Empirical scaling laws describe how test loss and other performance metrics depend on model size, dataset size, and compute. While such laws are consistent within specific regimes, apparently distinct scaling behaviors have been reported for related settings such as model compression. Motivated by recent progress in spectral analyses of neural representations, this paper develops a \emph{generalized spectral framework} that unifies learning dynamics and compression phenomena under a common functional ansatz. We generalize the spectral evolution function from the linear kernel form $g(λt)=λt$ to an asymptotically polynomial function $g(λ,t;β)$, characterized by an effective spectral--temporal elasticity $ρ(β)$. This framework recovers existing lazy and feature-learning theories as special cases and yields an invariant relation between learning and compression.
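To make the idea of a spectral evolution function concrete, the following is a minimal sketch and not the paper's actual construction. It assumes a power-law spectrum $λ_k = k^{-α}$, a per-mode residual of the form $λ_k \exp(-2 g)$, and a hypothetical polynomial form $g = λ^a t^b$ chosen only so that $a = b = 1$ reduces to the linear kernel form $g = λt$. The names `spectrum`, `g_kernel`, `g_poly`, `loss`, and the parameters `alpha`, `a`, `b`, `keep` are all illustrative assumptions. Pruning is modeled as spectral truncation, where modes beyond index `keep` are never learned.

```python
import math

def spectrum(num_modes, alpha=1.5):
    """Assumed power-law eigenvalues lambda_k = k^(-alpha), k = 1..num_modes."""
    return [k ** (-alpha) for k in range(1, num_modes + 1)]

def g_kernel(lam, t):
    """Linear kernel form quoted in the abstract: g = lambda * t."""
    return lam * t

def g_poly(lam, t, a=1.0, b=1.0):
    """Hypothetical polynomial form g = lambda^a * t^b (illustration only)."""
    return (lam ** a) * (t ** b)

def loss(lams, t, g, keep=None):
    """Residual loss sum_k lambda_k * exp(-2 * g(lambda_k, t)).

    If keep is given, only the first `keep` modes are learned; the
    remaining modes are treated as pruned and keep their full weight.
    """
    total = 0.0
    for k, lam in enumerate(lams):
        learned = keep is None or k < keep
        total += lam * (math.exp(-2.0 * g(lam, t)) if learned else 1.0)
    return total

lams = spectrum(1000)
slow = lambda lam, t: g_poly(lam, t, a=1.0, b=0.5)  # assumed sub-linear time exponent

for t in (1.0, 10.0, 100.0):
    print(t, loss(lams, t, g_kernel), loss(lams, t, slow), loss(lams, t, g_kernel, keep=50))
```

Under these assumptions, the loss decays as more modes cross the learning frontier, and truncating the spectrum places a floor on the loss equal to the total weight of the pruned modes.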
