Comparing Spectral Bias and Robustness For Two-Layer Neural Networks: SGD vs Adaptive Random Fourier Features

Aku Kammonen; Lisi Liang; Anamika Pandey; Raúl Tempone

TL;DR

This paper investigates how the choice of training algorithm affects spectral bias and robustness in a two-layer neural network. It contrasts SGD with adaptive random Fourier features (ARFF) and demonstrates that ARFF, by adaptively sampling Fourier frequencies, can bring the spectral bias closer to zero, where spectral bias is quantified as $SB = (\mathcal{E}_{high}-\mathcal{E}_{low})/(\mathcal{E}_{high}+\mathcal{E}_{low})$. Experimental results on function reconstruction show ARFF achieving spectral unbiasedness, while SGD remains spectrally biased. In MNIST (and CIFAR-10) experiments, ARFF-based models exhibit enhanced robustness to sparse additive perturbations, particularly when noisy validation data is used for early stopping, highlighting a practical path to improved reliability via frequency-aware training.
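The ratio above can be illustrated with a minimal sketch. The summary does not define $\mathcal{E}_{high}$ and $\mathcal{E}_{low}$, so this example assumes they are the energies of the residual's Fourier coefficients above and below a cutoff frequency, for a one-dimensional residual sampled on a uniform grid. The function name `spectral_bias` and the parameters `dx` and `cutoff` are hypothetical, and the paper's own estimator (Method 1 from kiessling2022computable) may normalize these quantities differently.

```python
import numpy as np


def spectral_bias(residual, dx, cutoff):
    """Return (E_high - E_low) / (E_high + E_low) for a 1D residual.

    Assumption (not from the paper): E_high and E_low are the summed squared
    magnitudes of the residual's discrete Fourier coefficients above and at or
    below `cutoff`, for samples spaced `dx` apart.
    """
    residual = np.asarray(residual, dtype=float)
    coeffs = np.fft.rfft(residual)
    freqs = np.fft.rfftfreq(residual.size, d=dx)
    power = np.abs(coeffs) ** 2

    e_low = power[freqs <= cutoff].sum()
    e_high = power[freqs > cutoff].sum()
    total = e_high + e_low
    if total == 0.0:
        # A zero residual has no error in either band; report no bias.
        return 0.0
    return float((e_high - e_low) / total)
```

Under these assumptions, a residual concentrated below the cutoff gives a value near -1, one concentrated above it gives a value near +1, and equal error energy in both bands gives 0, which matches the reading of "spectrally unbiased" as $SB \approx 0$.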

Abstract

We present experimental results highlighting two key differences resulting from the choice of training algorithm for two-layer neural networks. The spectral bias of neural networks is well known, but its dependence on the choice of training algorithm is less studied. Our experiments demonstrate that an adaptive random Fourier features (ARFF) algorithm can yield a spectral bias closer to zero than the stochastic gradient descent (SGD) optimizer. Additionally, we train two identically structured classifiers, one with SGD and one with ARFF, to the same accuracy levels and empirically assess their robustness against adversarial noise attacks.
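The robustness assessment can be pictured with a small sketch. The exact attack is not specified in this summary, so the example assumes a black-box perturbation that adds zero-mean Gaussian noise with standard deviation `sigma` to a random `fraction` of the pixels and then measures accuracy on the perturbed inputs. The names `sparse_additive_noise`, `accuracy_under_noise`, `fraction`, and `predict` are hypothetical and do not come from the paper.

```python
import numpy as np


def sparse_additive_noise(images, sigma, fraction, rng):
    """Add N(0, sigma^2) noise to a random subset of entries.

    Assumption: each pixel is perturbed independently with probability
    `fraction`; the paper's perturbation may be constructed differently.
    """
    images = np.asarray(images, dtype=float)
    mask = rng.random(images.shape) < fraction
    noise = rng.normal(0.0, sigma, size=images.shape)
    return images + np.where(mask, noise, 0.0)


def accuracy_under_noise(predict, images, labels, sigma, fraction, seed=0):
    """Accuracy of a black-box `predict` function on perturbed inputs."""
    rng = np.random.default_rng(seed)
    noisy = sparse_additive_noise(images, sigma, fraction, rng)
    return float(np.mean(predict(noisy) == np.asarray(labels)))
```

Calling `accuracy_under_noise` with the same `seed`, `sigma`, and `fraction` for two classifiers exposes both to the identical perturbation, which is one way to keep a comparison of this kind fair.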

Paper Structure (5 sections, 2 equations, 2 figures)

Introduction
Spectral bias
Robustness to an additive adversarial attack
Conclusion
Acknowledgement

Figures (2)

Figure 1: Spectral bias comparison, computed after each epoch, of a neural network trained with SGD and ARFF by using Method 1 from kiessling2022computable.
Figure 2: Left, Experiment 1: Accuracy on the MNIST test data after a sparse black box attack. Middle, Experiment 2: Accuracy on the MNIST test data after a black box attack for different noise levels. Right, Experiment 3: Accuracy on the MNIST test data after a black box noise attack for classifiers tuned, with the help of noisy validation data, to withstand noise attacks. In Experiment 3 the stopping criterion depends on the noise level $\sigma$.
