A Special Case of Quadratic Extrapolation Under the Neural Tangent Kernel
Abiel Kim
TL;DR
This work analyzes extrapolation by over-parameterized ReLU MLPs in the NTK regime specifically at the origin, a case that differs from far-field extrapolation because the NTK feature map is not translation-invariant. By constructing a shifted training set $\boldsymbol{x}_i^{\infty}=\boldsymbol{x}_i-t\boldsymbol{v}_{\boldsymbol{\rho}}$ and letting $t\to\infty$ while evaluating at $\boldsymbol{0}$, the author derives an explicit asymptotic form of the NTK Gram matrix under Tikhonov regularization and proves that the resulting predictor behaves as a quadratic extrapolator near the origin. The theoretical results show that the first and second directional derivatives exist, that higher-order derivatives vanish, and that the second derivative depends on the alignment between the feature directions and the evaluation direction; orthogonality can force it to zero. Overall, the paper identifies a canonical nonlinear extrapolation regime in NTK-based learning, complementing prior linear-extrapolation results and highlighting the role of NTK geometry in extrapolation behavior.
Abstract
It has been demonstrated both theoretically and empirically that the ReLU MLP tends to extrapolate linearly at an out-of-distribution evaluation point. The machine learning literature provides ample analysis of the mechanisms by which this linearity is induced. However, extrapolation at the origin under the NTK regime remains a comparatively unexplored special case. In particular, the infinite-dimensional feature map induced by the neural tangent kernel is not translation-invariant. This means that studying an out-of-distribution evaluation point very far from the origin is not equivalent to studying a point very near the origin. And since the feature map is rotation-invariant, these two special cases may represent the most canonical extremes of ReLU NTK extrapolation. Ultimately, it is this loose recognition of the two special cases of extrapolation that motivates the discovery of quadratic extrapolation for an evaluation point close to the origin.
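The invariance claims above can be checked numerically on a small example. The sketch below is illustrative only: it assumes the commonly used closed form of the bias-free two-layer ReLU NTK written with arc-cosine kernels, which may differ in depth and normalization from the kernel analyzed in the paper. The helper names `relu_ntk`, `rotate`, and `translate` are hypothetical and do not come from the paper.

```python
import math

def relu_ntk(x, y):
    """Bias-free two-layer ReLU NTK in arc-cosine form (assumed normalization)."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 or ny == 0.0:
        # In this bias-free form the kernel vanishes at the origin.
        return 0.0
    # Cosine of the angle between x and y, clamped against rounding error.
    u = sum(a * b for a, b in zip(x, y)) / (nx * ny)
    u = max(-1.0, min(1.0, u))
    theta = math.acos(u)
    k0 = (math.pi - theta) / math.pi                                   # degree-0 arc-cosine kernel
    k1 = (u * (math.pi - theta) + math.sqrt(1.0 - u * u)) / math.pi    # degree-1 arc-cosine kernel
    return nx * ny * (u * k0 + k1)

def rotate(x, angle):
    """Rotate a 2-D point about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def translate(x, shift):
    """Shift a 2-D point by a fixed vector."""
    return (x[0] + shift[0], x[1] + shift[1])

x, y = (1.0, 0.5), (-0.3, 2.0)
base = relu_ntk(x, y)
rotated = relu_ntk(rotate(x, 0.7), rotate(y, 0.7))                     # same rotation on both inputs
shifted = relu_ntk(translate(x, (3.0, -1.0)), translate(y, (3.0, -1.0)))  # same shift on both inputs

print("base   :", base)
print("rotated:", rotated)
print("shifted:", shifted)
```

Under these assumptions the kernel value is unchanged when both inputs are rotated together but changes when both are shifted together, because the kernel depends only on the two norms and the angle between the inputs. This is the sense in which evaluating near the origin and evaluating far from it are distinct problems.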
