A Unified Kernel for Neural Network Learning
Shao-Qun Zhang, Zong-Yi Chen, Yong-Ming Tian, Xun Lu
TL;DR
This work introduces the Unified Neural Kernel (UNK), a kernel induced by the inner product of produced variables and governed by gradient-descent dynamics with a multiplier $\lambda$ on the initial parameters. It is designed to bridge the Neural Network Gaussian Process (NNGP) and the Neural Tangent Kernel (NTK). The UNK recovers the NTK when $\lambda=0$ or $t=0$ and converges to the NNGP as $t\to\infty$ when $\lambda\neq 0$, giving a unified view of neural-kernel learning. The authors establish the existence, limiting behavior, convergence, and uniform tightness of the UNK and present explicit examples (including an $L_2$-regularizer case and a $t'$-based updating scheme) with supporting corollaries. They also validate the approach with MNIST-like experiments, which show improvements in fine-tuning pre-trained models. Overall, the work offers a cohesive theoretical framework for neural-kernel learning with practical implications for kernel-based inference and transfer learning.
Abstract
The past decades have witnessed great interest in the distinction and connection between neural network learning and kernel learning. Recent advances have made theoretical progress in connecting infinitely wide neural networks with Gaussian processes. Two predominant approaches have emerged: the Neural Network Gaussian Process (NNGP) and the Neural Tangent Kernel (NTK). The former, rooted in Bayesian inference, is a zero-order kernel, while the latter, grounded in the tangent space of gradient descent, is a first-order kernel. In this paper, we present the Unified Neural Kernel (UNK), which is induced by the inner product of produced variables and characterizes the learning dynamics of neural networks under gradient descent and parameter initialization. The proposed UNK maintains the limiting properties of both the NNGP and the NTK: it behaves like the NTK with a finite learning step and converges to the NNGP as the learning step approaches infinity. We also theoretically characterize the uniform tightness and learning convergence of the UNK, providing comprehensive insights into this unified kernel. Experimental results underscore the effectiveness of the proposed method.
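As a rough illustration of the zero-order versus first-order distinction drawn above, the following sketch computes finite-width empirical NNGP and NTK Gram matrices. It is not the UNK construction itself. It assumes a two-layer ReLU network $f(x) = a^\top \mathrm{relu}(Wx)/\sqrt{m}$ with standard Gaussian initialization; the function name `empirical_kernels` and all sizes are hypothetical choices made for the example.

```python
import numpy as np

def empirical_kernels(X, W, a):
    """Empirical NNGP and NTK Gram matrices for the assumed two-layer
    network f(x) = a . relu(W x) / sqrt(m)."""
    m = W.shape[0]
    pre = X @ W.T                      # (n, m) pre-activations
    act = np.maximum(pre, 0.0)         # ReLU features
    der = (pre > 0).astype(X.dtype)    # ReLU derivative
    # Zero-order kernel: covariance of outputs over a ~ N(0, I), with W fixed.
    nngp = act @ act.T / m
    # First-order kernel: inner product of parameter gradients.
    # The gradient w.r.t. a reproduces the NNGP term; the gradient w.r.t. W
    # adds (x . x') * sum_j a_j^2 * der_j(x) * der_j(x') / m.
    da = der * a
    ntk = nngp + (X @ X.T) * (da @ da.T) / m
    return nngp, ntk

rng = np.random.default_rng(0)
n, d, m = 5, 3, 4096                   # illustrative sizes only
X = rng.standard_normal((n, d))
W = rng.standard_normal((m, d))
a = rng.standard_normal(m)
nngp, ntk = empirical_kernels(X, W, a)
```

In this toy setting the NTK equals the NNGP term plus a positive semidefinite contribution from the hidden-layer gradients, which is one concrete sense in which the former is first-order and the latter zero-order.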
