Dynamic Spectral Backpropagation for Efficient Neural Network Training
Mannmohan Muthuraman
TL;DR
DSBP targets training efficiency under limited data and compute by projecting layerwise gradients onto the top-$k$ eigenvectors of the corresponding covariance matrices, which reduces the per-layer cost to $O(k d_l)$ and biases updates toward flatter minima. It introduces five extensions (dynamic spectral inference, spectral architecture optimization, spectral meta-learning, spectral transfer regularization, and Lie-algebra-inspired dynamics), supported by a third-order stochastic differential equation and a PAC-Bayes generalization bound. Experiments on CIFAR-10, Fashion-MNIST, MedMNIST, and Tiny ImageNet show DSBP outperforming SAM, LoRA, and MAML in accuracy and training efficiency, with ablations highlighting the importance of $k$, $p$, and pruning. The work offers a scalable, theoretically grounded framework for robust, efficient training and points to future directions in scalability, fairness, robotics, and ethical deployment.
Abstract
Dynamic Spectral Backpropagation (DSBP) enhances neural network training under resource constraints by projecting gradients onto principal eigenvectors, which reduces computational complexity and promotes flat minima. Five extensions are proposed to address challenges in robustness, few-shot learning, and hardware efficiency: dynamic spectral inference, spectral architecture optimization, spectral meta-learning, spectral transfer regularization, and Lie-algebra-inspired dynamics. Supported by a third-order stochastic differential equation (SDE) and a PAC-Bayes bound, DSBP outperforms Sharpness-Aware Minimization (SAM), Low-Rank Adaptation (LoRA), and Model-Agnostic Meta-Learning (MAML) on CIFAR-10, Fashion-MNIST, MedMNIST, and Tiny ImageNet, as demonstrated through extensive experiments and visualizations. Future work focuses on scalability, bias mitigation, and ethical considerations.
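The core projection step can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It assumes the covariance is computed from a layer's activations, which the abstract does not specify, and the names `top_k_eigvecs` and `project_gradient`, the layer width of 32, and $k = 4$ are hypothetical choices made for illustration only.

```python
import numpy as np

def top_k_eigvecs(cov, k):
    """Return the k eigenvectors of a symmetric matrix with the largest eigenvalues."""
    # eigh returns eigenvalues in ascending order, so take the last k columns
    # and reverse them to get descending order.
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, -k:][:, ::-1]          # shape (d, k), orthonormal columns

def project_gradient(grad, basis):
    """Project grad onto the span of the orthonormal columns of basis."""
    # Two thin matrix-vector products: O(k * d) for a length-d gradient.
    return basis @ (basis.T @ grad)

rng = np.random.default_rng(0)
acts = rng.normal(size=(256, 32))         # hypothetical activations: 256 samples, d = 32
cov = np.cov(acts, rowvar=False)          # (32, 32) covariance (assumed choice)
V = top_k_eigvecs(cov, k=4)               # top-4 spectral basis
g = rng.normal(size=32)                   # stand-in for a layer gradient
g_proj = project_gradient(g, V)           # gradient restricted to the top-k subspace
```

Because the basis is orthonormal, the projection is idempotent and never increases the gradient norm. The eigendecomposition itself costs more than $O(k d_l)$, so in this sketch the quoted cost applies only to the projection once the basis is available.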
