Frequency and Generalisation of Periodic Activation Functions in Reinforcement Learning
Augustine N. Mavor-Parker, Matthew J. Sargent, Caswell Barry, Lewis Griffin, Clare Lyle
TL;DR
This work empirically analyzes learned Fourier features in off-policy reinforcement learning, focusing on whether improvements arise from high-frequency expressivity or low-frequency generalization. By comparing learned Fourier features (LFF) and CLFF within Soft Actor-Critic (SAC) on the DeepMind Control Suite, the authors show that periodic representations consistently converge to high frequencies largely independent of initialization, and that their generalization benefits erode under input noise due to increased brittleness and higher effective rank. Weight decay is proposed as a practical regularizer that partially offsets overfitting and maintains faster learning while improving robustness. The findings suggest a trade-off between expressiveness and generalization, motivating adaptive architectures that can modulate frequency according to state novelty or perturbations.
Abstract
Periodic activation functions, often referred to as learned Fourier features, have been widely demonstrated to improve sample efficiency and stability in a variety of deep RL algorithms. Potentially incompatible hypotheses have been proposed about the source of these improvements. One is that periodic activations learn low-frequency representations and, as a result, avoid overfitting to bootstrapped targets. Another is that periodic activations learn high-frequency representations that are more expressive, allowing networks to quickly fit complex value functions. We analyse these claims empirically, finding that periodic representations consistently converge to high frequencies regardless of their initialisation frequency. We also find that while periodic activation functions improve sample efficiency, they exhibit worse generalisation on states with added observation noise, especially when compared to otherwise equivalent networks with ReLU activation functions. Finally, we show that weight decay regularisation is able to partially offset the overfitting of periodic activation functions, delivering value functions that learn quickly while also generalising.
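To make the link between frequency, noise sensitivity, and weight decay concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a periodic layer of the form sin(Wx + b), so that the scale of W sets the frequency of each unit. The helper names (`periodic_features`, `noise_sensitivity`, `weight_decay_step`), the layer sizes, the 20x frequency multiplier, and the noise level are all hypothetical choices made for illustration.

```python
import math
import random


def periodic_features(x, weights, biases):
    """Assumed periodic layer: one sin(w . x + b) unit per row of `weights`."""
    return [
        math.sin(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def noise_sensitivity(weights, biases, noise_std, rng, trials=200):
    """Mean absolute change in the features when Gaussian noise is added to the input."""
    dim = len(weights[0])
    total = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        noisy = [xi + rng.gauss(0.0, noise_std) for xi in x]
        clean_out = periodic_features(x, weights, biases)
        noisy_out = periodic_features(noisy, weights, biases)
        total += sum(abs(a - b) for a, b in zip(clean_out, noisy_out)) / len(clean_out)
    return total / trials


def weight_decay_step(weights, lr, decay):
    """Decoupled weight decay: multiply every weight by (1 - lr * decay)."""
    shrink = 1.0 - lr * decay
    return [[w * shrink for w in row] for row in weights]


rng = random.Random(0)
dim, units = 4, 32  # illustrative sizes, not taken from the paper
low = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(units)]
biases = [rng.uniform(-math.pi, math.pi) for _ in range(units)]
high = [[20.0 * w for w in row] for row in low]  # same directions, 20x the frequency

low_sens = noise_sensitivity(low, biases, noise_std=0.05, rng=rng)
high_sens = noise_sensitivity(high, biases, noise_std=0.05, rng=rng)
decayed = weight_decay_step(high, lr=0.1, decay=0.5)

print("low-frequency sensitivity: ", low_sens)
print("high-frequency sensitivity:", high_sens)
```

Under these assumptions, the high-frequency layer changes its output far more than the low-frequency one for the same observation noise, and each weight decay step scales W, and hence the unit frequencies, towards zero. This only illustrates the mechanism in isolation; it says nothing about how the effect plays out during actual value function training.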
