Comparing Spectral Bias and Robustness For Two-Layer Neural Networks: SGD vs Adaptive Random Fourier Features

Aku Kammonen, Lisi Liang, Anamika Pandey, Raúl Tempone

TL;DR

This paper investigates how the choice of training algorithm affects spectral bias and robustness in a two-layer neural network. It contrasts SGD with adaptive random Fourier features (ARFF) and demonstrates that ARFF, by adaptively sampling Fourier frequencies, can bring the spectral bias close to zero, where spectral bias is quantified as $SB = (\mathcal{E}_{high}-\mathcal{E}_{low})/(\mathcal{E}_{high}+\mathcal{E}_{low})$. Experimental results on function reconstruction show ARFF achieving spectral unbiasedness, while SGD remains spectrally biased. In MNIST (and CIFAR-10) experiments, ARFF-based models exhibit enhanced robustness to sparse additive perturbations, particularly when early stopping is driven by noisy validation data, highlighting a practical path to improved reliability via frequency-aware training.
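As a rough illustration of the spectral bias formula, the sketch below evaluates it for a one-dimensional residual (target minus prediction) sampled on a uniform grid. The FFT-based energy estimate, the `cutoff_index` split between low and high frequencies, and the function name `spectral_bias` are assumptions made for this example; they are not the paper's Method 1, and any per-band normalization the paper may apply is not reproduced here.

```python
import numpy as np

def spectral_bias(residual, cutoff_index):
    """Return SB = (E_high - E_low) / (E_high + E_low) for a 1D residual.

    Assumption: E_low and E_high are the residual's raw power below and
    at/above a hypothetical cutoff bin of the real FFT on a uniform grid.
    """
    power = np.abs(np.fft.rfft(residual)) ** 2
    e_low = power[:cutoff_index].sum()
    e_high = power[cutoff_index:].sum()
    return (e_high - e_low) / (e_high + e_low)

# Toy residuals on a uniform grid over [0, 1).
x = np.arange(256) / 256
low = np.sin(2 * np.pi * 2 * x)    # error only at a low frequency
high = np.sin(2 * np.pi * 40 * x)  # error only at a high frequency

sb_low = spectral_bias(low, cutoff_index=10)           # close to -1
sb_high = spectral_bias(high, cutoff_index=10)         # close to +1
sb_mixed = spectral_bias(low + high, cutoff_index=10)  # close to 0
```

With this sign convention, a value near +1 means the remaining error sits mostly in high frequencies, a value near -1 means it sits mostly in low frequencies, and a value near 0 corresponds to a spectrally unbiased fit.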

Abstract

We present experimental results highlighting two key differences resulting from the choice of training algorithm for two-layer neural networks. The spectral bias of neural networks is well known, while the dependence of the spectral bias on the choice of training algorithm is less studied. Our experiments demonstrate that an adaptive random Fourier features algorithm (ARFF) can yield a spectral bias closer to zero than the stochastic gradient descent optimizer (SGD). Additionally, we train two identically structured classifiers, employing SGD and ARFF, to the same accuracy levels and empirically assess their robustness against adversarial noise attacks.

Paper Structure

This paper contains 5 sections, 2 equations, and 2 figures.

Figures (2)

  • Figure 1: Spectral bias comparison, computed after each epoch, of a neural network trained with SGD and ARFF by using Method 1 from kiessling2022computable.
  • Figure 2: Left, Experiment 1: Accuracy on the MNIST test data after a sparse black box attack. Middle, Experiment 2: Accuracy on the MNIST test data after a black box attack for different noise levels. Right, Experiment 3: Accuracy on the MNIST test dataset after a black box noise attack for classifiers tuned, with the help of noisy validation data, to withstand noise attacks. In Experiment 3 the stopping criterion depends on the noise level $\sigma$.
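
The exact attack used in these experiments is not specified in this summary. As a minimal sketch of what a sparse additive noise perturbation with noise level $\sigma$ could look like, the code below adds zero-mean Gaussian noise to a random subset of pixels without consulting the classifier. The function name `sparse_additive_noise`, the `fraction` parameter, the Gaussian noise model, and the random stand-in images are all assumptions for illustration only.

```python
import numpy as np

def sparse_additive_noise(images, sigma, fraction, rng):
    """Add N(0, sigma^2) noise to a random subset of pixels.

    Assumption: each pixel is perturbed independently with probability
    `fraction`; the model under attack is never queried.
    """
    mask = rng.random(images.shape) < fraction
    noise = rng.normal(0.0, sigma, size=images.shape)
    return images + mask * noise

rng = np.random.default_rng(0)
images = rng.random((10, 28, 28))  # stand-in for MNIST-sized inputs in [0, 1)
perturbed = sparse_additive_noise(images, sigma=0.5, fraction=0.1, rng=rng)
changed_fraction = np.mean(perturbed != images)  # roughly equal to `fraction`
```

The sketch does not clip the perturbed values back to the original pixel range; whether to do so depends on the preprocessing of the classifier being evaluated.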