Neural network initialization with nonlinear characteristics and information on spectral bias
Hikaru Homma, Jun Ohkubo
TL;DR
The paper studies how initialization affects neural network training by integrating spectral-bias information into SWIM-based parameter initialization. It introduces a per-layer schedule for the nonlinearity scale factors $s_{1,l}$ (with $s_{2,l}=\tfrac{1}{2}s_{1,l}$) so that early layers encode coarse information and later layers encode fine details. Without gradient-based training, this improves performance on both a 1D regression task and MNIST classification. Empirically, the proposed ordered schedule outperforms the original SWIM and the reversed schedule when the network width is large, which highlights the practical value of exploiting intrinsic spectral properties. The authors suggest extending the approach to other architectures and optimizing the hyperparameters to further exploit spectral-bias effects in data-driven initialization.
Abstract
The initialization of neural network parameters, such as weights and biases, has a crucial impact on learning performance; a good choice can even remove the need for additional training with backpropagation. For example, initialization algorithms based on the ridgelet transform or the SWIM (sampling where it matters) concept have been proposed. It is also well known that neural networks tend to learn coarse information in the earlier layers; this tendency is called spectral bias. In this work, we investigate the effects of using information on the spectral bias in the initialization of neural networks. Specifically, we propose a framework that adjusts the scale factors in the SWIM algorithm so that the early hidden layers capture low-frequency components and the late hidden layers represent high-frequency components. Numerical experiments on a one-dimensional regression task and the MNIST classification task demonstrate that the proposed method outperforms conventional initialization algorithms. This work clarifies the importance of intrinsic spectral properties in training neural networks, and the finding yields an effective parameter initialization strategy that enhances learning performance.
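To make the idea of layer-dependent scale factors concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions that this section does not spell out: that a SWIM-style neuron is built from a pair of data points $(x_1, x_2)$ as $w = s_1 (x_2 - x_1)/\lVert x_2 - x_1\rVert^2$ and $b = -\langle w, x_1\rangle - s_2$; that smaller $s_{1,l}$ corresponds to coarser features, so the schedule increases with depth; and that the schedule is geometric. The function names `scale_schedule` and `pair_to_neuron` and the endpoint values are hypothetical.

```python
def scale_schedule(num_layers, s1_first, s1_last):
    """Return [(s1_l, s2_l)] for each hidden layer l.

    Assumption: s1 is interpolated geometrically from s1_first (early layer)
    to s1_last (late layer). The relation s2_l = s1_l / 2 follows the text.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if s1_first <= 0 or s1_last <= 0:
        raise ValueError("scale factors must be positive")
    if num_layers == 1:
        return [(s1_first, 0.5 * s1_first)]
    ratio = (s1_last / s1_first) ** (1.0 / (num_layers - 1))
    schedule = []
    for layer in range(num_layers):
        s1 = s1_first * ratio ** layer
        schedule.append((s1, 0.5 * s1))
    return schedule


def pair_to_neuron(x1, x2, s1, s2):
    """Build one neuron (w, b) from a pair of points.

    Assumed SWIM-style construction:
        w = s1 * (x2 - x1) / ||x2 - x1||^2
        b = -<w, x1> - s2
    so the pre-activation is -s2 at x1 and s1 - s2 at x2.
    """
    diff = [b_i - a_i for a_i, b_i in zip(x1, x2)]
    norm_sq = sum(d * d for d in diff)
    if norm_sq == 0.0:
        raise ValueError("x1 and x2 must be distinct points")
    w = [s1 * d / norm_sq for d in diff]
    b = -sum(w_i * x_i for w_i, x_i in zip(w, x1)) - s2
    return w, b


def preactivation(w, b, x):
    """Compute <w, x> + b for a single input x."""
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b


if __name__ == "__main__":
    # Hypothetical endpoints: small scale in the first layer, large in the last.
    for layer, (s1, s2) in enumerate(scale_schedule(3, 1.0, 4.0), start=1):
        w, b = pair_to_neuron([0.0], [2.0], s1, s2)
        print(layer, s1, s2, preactivation(w, b, [0.0]), preactivation(w, b, [2.0]))
```

Under these assumptions, the relation $s_{2,l}=\tfrac{1}{2}s_{1,l}$ places the two sampled points symmetrically around zero in pre-activation space, at $-s_{2,l}$ and $+s_{2,l}$. A larger $s_{1,l}$ therefore pushes the pair further into the saturated region of a tanh-like activation and produces a sharper transition between the two points, which is one way to read "fine details in later layers".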
