How Learnable Grids Recover Fine Detail in Low Dimensions: A Neural Tangent Kernel Analysis of Multigrid Parametric Encodings
Samuel Audia, Soheil Feizi, Matthias Zwicker, Dinesh Manocha
TL;DR
The paper tackles spectral bias in coordinate-based neural networks by comparing Fourier feature encodings (FFE) and multigrid parametric encodings (MPE) through the lens of the neural tangent kernel (NTK). It derives a finite-width NTK for MPEs and proves a lower bound on the eigenvalue spectrum showing that the improvement arises from the structure of the grid rather than from the learnable embedding. FFEs, by contrast, gain solely from their embedding space. Empirically, the authors evaluate high-frequency detail learning on 2D image regression (images from ImageNet synonym sets) and 3D implicit surface regression (Stanford meshes). The MPE raises the minimum NTK eigenvalue by 8 orders of magnitude over the baseline and 2 orders of magnitude over the FFE, with correspondingly higher PSNR and MS-SSIM scores. The findings give theoretical and practical justification for using grid-based encodings to mitigate spectral bias, with implications for graphics and scientific ML tasks. The work also outlines limitations and directions for future research, including the effect of activation functions and the choice of interpolation kernels for domain-specific performance.
Abstract
Neural networks that map between low-dimensional spaces are ubiquitous in computer graphics and scientific computing; however, in their naive implementation, they are unable to learn high-frequency information. We present a comprehensive analysis comparing the two most common techniques for mitigating this spectral bias: Fourier feature encodings (FFE) and multigrid parametric encodings (MPE). FFEs are seen as the standard for low-dimensional mappings, but MPEs often outperform them and learn representations with higher resolution and finer detail. The FFE's roots in the Fourier transform make it susceptible to aliasing if pushed too far, while MPEs, which use a learned grid structure, have no such limitation. To understand the difference in performance, we use the neural tangent kernel (NTK) to evaluate these encodings through the lens of an analogous kernel regression. By finding a lower bound on the smallest eigenvalue of the NTK, we prove that MPEs improve a network's performance through the structure of their grid and not their learnable embedding. This mechanism is fundamentally different from FFEs, which rely solely on their embedding space to improve performance. Results are empirically validated on a 2D image regression task using images taken from 100 synonym sets of ImageNet and on 3D implicit surface regression using objects from the Stanford graphics dataset. We show that the MPE increases the minimum eigenvalue of the NTK by 8 orders of magnitude over the baseline and 2 orders of magnitude over the FFE. Using peak signal-to-noise ratio (PSNR) and multiscale structural similarity (MS-SSIM) to evaluate how well fine details are learned, we find that this increase in spectrum corresponds to a 15 dB (PSNR) / 0.65 (MS-SSIM) increase over the baseline and a 12 dB (PSNR) / 0.33 (MS-SSIM) increase over the FFE.
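To make the NTK argument concrete, the following is a minimal toy sketch in Python with NumPy, not the authors' implementation. It assumes a 1D input and a purely linear readout, so the kernel is simply the Gram matrix of parameter gradients; the paper analyzes encodings followed by an MLP. The function names, the power-of-two frequency schedule, the grid resolutions, and the sample count are all illustrative assumptions, and the numbers it prints are not expected to match the paper's results.

```python
import numpy as np

def fourier_features(x, num_freqs):
    """Assumed FFE form: [sin(2^k pi x), cos(2^k pi x)] for k = 0..num_freqs-1."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi
    angles = x[:, None] * freqs[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

def grid_weights(x, resolution):
    """Linear-interpolation weights of x in [0, 1] on a uniform grid of nodes."""
    pos = x * (resolution - 1)
    left = np.clip(np.floor(pos).astype(int), 0, resolution - 2)
    frac = pos - left
    w = np.zeros((x.size, resolution))
    rows = np.arange(x.size)
    w[rows, left] = 1.0 - frac
    w[rows, left + 1] = frac
    return w

def multigrid_weights(x, resolutions):
    """Concatenate interpolation weights over several grid levels."""
    return np.concatenate([grid_weights(x, r) for r in resolutions], axis=1)

def min_eig(K):
    """Smallest eigenvalue of a symmetric matrix."""
    return np.linalg.eigvalsh(K)[0]

x = np.linspace(0.0, 1.0, 16)  # 16 training coordinates (assumption)

# Baseline f(x) = a*x + b: the gradient w.r.t. (a, b) is [x, 1].
J_base = np.stack([x, np.ones_like(x)], axis=1)
# FFE with a linear readout: the gradient w.r.t. the readout is the feature vector.
J_ffe = fourier_features(x, num_freqs=4)
# Grid model f(x) = sum of interpolated grid values: the gradient w.r.t. the
# grid values is the vector of interpolation weights.
J_grid = multigrid_weights(x, resolutions=[4, 8, 16, 32])

# Toy NTK: K[i, j] = <grad f(x_i), grad f(x_j)>.
K_base, K_ffe, K_grid = (J @ J.T for J in (J_base, J_ffe, J_grid))

for name, K in [("baseline", K_base), ("FFE", K_ffe), ("multigrid", K_grid)]:
    print(f"{name:>9}: min eigenvalue = {min_eig(K):.3e}")
```

In this linear toy the baseline kernel has rank at most 2 and the FFE kernel has rank at most twice the number of frequencies, so with 16 samples both are rank deficient. The grid kernel stays well conditioned because, on the finest level, each sample touches its own pair of grid nodes. This only illustrates the mechanism the abstract describes, namely the grid structure itself raising the bottom of the spectrum, and should not be read as a reproduction of the paper's bound.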
