Analyzing the Neural Tangent Kernel of Periodically Activated Coordinate Networks
Hemanth Saratchandran, Shin-Fang Chng, Simon Lucey
TL;DR
This paper examines why periodically activated networks, in particular cosine-activated coordinate networks, can outperform ReLU networks, and does so by analyzing their Neural Tangent Kernel (NTK). It derives two-sided bounds on the minimum eigenvalue of the empirical NTK, $\lambda_{\min}(K_L)$, in a finite-width regime with a single wide hidden layer, showing a $\Theta(n_k^{3/2})$ scaling that yields a larger spectral gap than ReLU activations. The authors also prove a memorization capacity theorem under similar width growth conditions and provide empirical evidence that supports the theory, including comparisons with ReLU and measurements of the empirical Lipschitz constant. Together, these results clarify how periodic activations influence training dynamics and memorization in coordinate networks, and they point to potential practical benefits for implicit neural representations.
Abstract
Recently, neural networks using periodic activation functions have been shown to outperform traditional ReLU-activated networks in vision tasks. However, the reasons for this improved performance remain poorly understood. In this paper, we aim to address this gap by providing a theoretical understanding of periodically activated networks through an analysis of their Neural Tangent Kernel (NTK). We derive bounds on the minimum eigenvalue of their NTK in the finite-width setting, using a fairly general network architecture that requires only one wide layer whose width grows at least linearly with the number of data samples. Our findings indicate that periodically activated networks are notably better behaved, from the NTK perspective, than ReLU-activated networks. Additionally, we give an application to the memorization capacity of such networks and verify our theoretical predictions empirically. Our study offers a deeper understanding of the properties of periodically activated neural networks and their potential in deep learning.
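To make the central quantity concrete, the following is a minimal sketch of how the minimum eigenvalue of an empirical NTK Gram matrix could be computed numerically. It is not the authors' setup: it assumes a one-hidden-layer network $f(x) = a^\top \phi(Wx)/\sqrt{m}$ with Gaussian initialization and uniformly sampled inputs, rather than the deeper architecture with a single wide layer analyzed in the paper. The names `empirical_ntk` and `min_ntk_eigenvalue` and all default parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def empirical_ntk(X, W, a, act, dact):
    """Empirical NTK Gram matrix of f(x) = a . act(W x) / sqrt(m).

    Assumed toy model (not the paper's architecture): one hidden layer of
    width m, with both W (m x d) and a (m,) treated as trainable.
    """
    m = W.shape[0]
    Z = X @ W.T                  # (N, m) pre-activations
    H = act(Z)                   # (N, m) hidden features
    D = dact(Z) * a              # (N, m) entries a_j * act'(w_j . x_i)
    K_a = H @ H.T                # contribution of gradients w.r.t. a
    K_W = (D @ D.T) * (X @ X.T)  # contribution of gradients w.r.t. W
    return (K_a + K_W) / m


def min_ntk_eigenvalue(activation, n_samples=32, dim=2, width=512, seed=0):
    """Smallest eigenvalue of the empirical NTK for a random toy network."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_samples, dim))  # assumed input distribution
    W = rng.normal(size=(width, dim))                  # assumed Gaussian init
    a = rng.normal(size=width)
    if activation == "cos":
        act, dact = np.cos, lambda z: -np.sin(z)
    elif activation == "relu":
        act = lambda z: np.maximum(z, 0.0)
        dact = lambda z: (z > 0).astype(float)
    else:
        raise ValueError(f"unknown activation: {activation}")
    K = empirical_ntk(X, W, a, act, dact)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return float(np.linalg.eigvalsh(K)[0])


if __name__ == "__main__":
    for name in ("cos", "relu"):
        print(name, min_ntk_eigenvalue(name))
```

Running the script prints one minimum eigenvalue per activation for a single random draw. Any gap seen in such a toy experiment is only suggestive; the bounds described above concern the paper's own architecture and width assumptions.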
