Adaptive Random Fourier Features Training Stabilized By Resampling With Applications in Image Regression

Aku Kammonen; Anamika Pandey; Erik von Schwerin; Raúl Tempone

TL;DR

The paper addresses instability and hyperparameter sensitivity in Adaptive Random Fourier Features (ARFF) by introducing a particle-filter–style resampling mechanism. This resampling yields a stabilized training dynamic and allows Metropolis-free operation, enabling both standalone training and pretraining for gradient-based optimization. The authors demonstrate the approach on function regression and image regression by adaptively sampling RFF frequencies for the RFF layer in coordinate-based MLPs, achieving faster early convergence and improved robustness. Collectively, the work offers a practical method to automate RFF frequency selection in scalable shallow networks and RFF-enabled MLPs used for high-frequency image representation.

Abstract

This paper presents an enhanced adaptive random Fourier features (ARFF) training algorithm for shallow neural networks, building upon the work introduced in "Adaptive Random Fourier Features with Metropolis Sampling", Kammonen et al., Foundations of Data Science, 2(3):309–332, 2020. This improved method uses a particle filter-type resampling technique to stabilize the training process and reduce the sensitivity to parameter choices. The Metropolis test can also be omitted when resampling is used, reducing the number of hyperparameters by one and reducing the computational cost per iteration compared to the ARFF method. We present comprehensive numerical experiments demonstrating the efficacy of the proposed algorithm in function regression tasks as a stand-alone method and as a pretraining step before gradient-based optimization, using the Adam optimizer. Furthermore, we apply the proposed algorithm to a simple image regression problem, illustrating its utility in sampling frequencies for the random Fourier features (RFF) layer of coordinate-based multilayer perceptrons. In this context, we use the proposed algorithm to sample the parameters of the RFF layer in an automated manner.
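
To make the idea in the abstract concrete, the following is a minimal sketch of a Metropolis-free training loop with resampling, not the authors' algorithm as published. It assumes a shallow network of the form $\sum_k \beta_k e^{i\omega_k\cdot x}$, amplitudes obtained by Tikhonov-regularized least squares, resampling weights proportional to $|\beta_k|$, and a Gaussian random-walk step of size `delta`. The function names (`fit_amplitudes`, `resample_and_walk`, `train`), the parameter defaults, the standard normal initialization, and the choice to resample in every iteration are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_amplitudes(x, y, omega, lam):
    """Regularized least squares for beta in sum_k beta_k * exp(i * omega_k . x)."""
    S = np.exp(1j * (x @ omega.T))             # feature matrix, shape (M, K)
    M, K = S.shape
    A = S.conj().T @ S + lam * M * np.eye(K)   # normal equations with Tikhonov term
    beta = np.linalg.solve(A, S.conj().T @ y)
    mse = float(np.mean(np.abs(S @ beta - y) ** 2))
    return beta, mse


def resample_and_walk(omega, beta, delta, rng):
    """Resample frequencies with probability proportional to |beta_k| (an assumption
    of this sketch), then perturb each copy with a Gaussian random-walk step."""
    p = np.abs(beta)
    p = p / p.sum()
    idx = rng.choice(omega.shape[0], size=omega.shape[0], p=p)
    return omega[idx] + delta * rng.standard_normal(omega.shape)


def train(x, y, K=32, iters=20, delta=0.3, lam=1e-3, seed=0):
    """Alternate amplitude fitting and frequency resampling; no Metropolis test."""
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((K, x.shape[1]))  # hypothetical initialization
    history = []
    for _ in range(iters):
        beta, mse = fit_amplitudes(x, y, omega, lam)
        history.append(mse)
        omega = resample_and_walk(omega, beta, delta, rng)
    beta, mse = fit_amplitudes(x, y, omega, lam)  # amplitudes for the final frequencies
    history.append(mse)
    return omega, beta, history
```

A call such as `train(x, y, K=64)` with `x` of shape `(M, d)` and `y` of shape `(M,)` returns the final frequencies, the amplitudes, and the training error after each amplitude fit. Because the resampling step replaces the accept/reject decision, the loop has no Metropolis exponent $\gamma$ to tune, which is the hyperparameter reduction the abstract refers to.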

Paper Structure (16 sections, 17 equations, 20 figures, 9 tables, 1 algorithm)

Introduction
Adaptive Random Features With Resampling
Problem Statement and Adaptive Random Fourier Features
Training Algorithm With Resampling
Numerical Experiments
Regularized Discontinuity in Four Dimensions
Algorithm 1 as a Stand-alone Training Algorithm
Test 1: Statistics over 100 independent realizations
Test 2: The full data-set in each iteration with $\gamma=10$ in the Metropolis tests.
Test 3: Full data-set in each iteration, comparing $\gamma=1$ and $\gamma=10$.
Test 4: Using a batch size of $M_B<M$.
Test 5: Non-degenerate initial distribution of frequencies.
Algorithm 1 as a Pretraining Algorithm
Test 6: Accelerating the Adam optimizer by pretraining with Algorithm 1
Coordinate-based Multilayer Perceptrons
...and 1 more sections

Figures (20)

Figure 1: Test 1 (i.e., \ref{eq:reg_disc_data_set} with $B$ in \ref{eq:rot_mat} and parameters in Table \ref{tab:Sigint_statistics}), based on 100 independent realizations of the stochastic algorithms for each $K$. Top row: Convergence of the minimal training and testing errors w.r.t. the number of nodes, $K$. Sample means with error bars indicating a confidence interval of $\pm 2$ sample standard deviations. The $\lambda\to 0$ limit of the error estimate \ref{eq:error_bound} is included for reference. Bottom row: Errors for $K=512$ as a function of the number of iterations. Sample means and sample means $\pm 2$ sample standard deviations.
Figure 2: Test 2 (i.e., \ref{eq:reg_disc_data_set} with $B$ in \ref{eq:rot_mat} and parameters in Table \ref{tab:Sigint_full}) with one realization of the stochastic algorithms for each $K$. Top row: Convergence of the minimal training and testing errors w.r.t. the number of nodes, $K$. The $\lambda\to 0$ limit of the error estimate \ref{eq:error_bound} is included for reference. Middle row: Errors for $K=1024$ as a function of the number of iterations. Bottom row: Normalized effective sample size, $K_\mathrm{ESS}/K$, with $K_\mathrm{ESS}$ defined in \ref{eq:ESS}, for $K=1024$ as a function of the number of iterations.
Figure 3: Test 3 (i.e., \ref{eq:reg_disc_data_set} with $B$ in \ref{eq:rot_mat}) illustrating sensitivity w.r.t. $\gamma$. One realization of the stochastic algorithms is included. Left column: The case $\gamma=1$, with parameters in Table \ref{tab:Sigint_gamma_1}. Right column: The case $\gamma=10$, with parameters in Table \ref{tab:Sigint_full} for $K=256$.
Figure 4: Test 4 (i.e., \ref{eq:reg_disc_data_set} with $B$ in \ref{eq:rot_mat}) illustrating sensitivity w.r.t. $M_B$. One realization of the stochastic algorithms is included. Parameters are the same as in the case of $K=512$ in Table \ref{tab:Sigint_full}, except $M_B$, which is $M_B=10^3$ (top), $M_B=10^4$ (middle), and $M_B=M$ (bottom).
Figure 5: Test 4 (i.e., \ref{eq:reg_disc_data_set} with $B$ in \ref{eq:rot_mat}) illustrating the effective sample sizes corresponding to Figure 4. One realization of the stochastic algorithms is included. Parameters are the same as in the case of $K=512$ in Table \ref{tab:Sigint_full}, except $M_B$, which is $M_B=10^3$ (top), $M_B=10^4$ (middle), and $M_B=M$ (bottom).
...and 15 more figures
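
Several captions report the normalized effective sample size $K_\mathrm{ESS}/K$. The paper's own definition is not reproduced on this page, so the sketch below assumes the standard particle-filter form $K_\mathrm{ESS} = 1/\sum_k p_k^2$ with normalized weights $p_k$, and further assumes that the weights are proportional to the amplitude moduli $|\beta_k|$. The function name `normalized_ess` is hypothetical.

```python
import numpy as np


def normalized_ess(beta):
    """Return K_ESS / K, assuming K_ESS = 1 / sum_k p_k^2 with p_k proportional to |beta_k|."""
    w = np.abs(np.asarray(beta))
    p = w / w.sum()                       # normalized weights, sum to 1
    return 1.0 / (p.size * np.sum(p ** 2))
```

Under this assumed definition, the ratio equals 1 when all weights are equal and drops to $1/K$ when a single frequency carries all the weight, so a low value signals that only a few nodes contribute to the fit.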
