Grounding and Enhancing Grid-based Models for Neural Fields

Zelin Zhao; Fenglei Fan; Wenlong Liao; Junchi Yan

TL;DR

This work addresses the lack of principled theory for grid-based neural fields by introducing Grid Tangent Kernel (GTK) theory to connect grid architectures to training dynamics and generalization. It then proposes MulFAGrid, an adaptive grid-based model that fuses multiplicative Fourier features with kernel learning and node-wise normalization, compatible with both regular and irregular grids. The GTK framework enables a rigorous analysis showing GTK invariance during training and provides a Rademacher-complexity–based generalization bound, guiding design choices. Empirically, MulFAGrid achieves state-of-the-art performance across 2D image fitting, 3D SDF reconstruction, and NeRF-based novel view synthesis, supported by ablations that validate the importance of learned kernels and Fourier features. Overall, the paper offers a principled toolkit for designing efficient, expressive grid-based neural fields with strong generalization and practical impact.

Abstract

Many contemporary studies utilize grid-based models for neural field representation, but a systematic analysis of grid-based models is still missing, hindering the improvement of those models. Therefore, this paper introduces a theoretical framework for grid-based models. This framework points out that these models' approximation and generalization behaviors are determined by grid tangent kernels (GTK), which are intrinsic properties of grid-based models. The proposed framework facilitates a consistent and systematic analysis of diverse grid-based models. Furthermore, the introduced framework motivates the development of a novel grid-based model named the Multiplicative Fourier Adaptive Grid (MulFAGrid). The numerical analysis demonstrates that MulFAGrid exhibits a lower generalization bound than its predecessors, indicating its robust generalization performance. Empirical studies reveal that MulFAGrid achieves state-of-the-art performance in various tasks, including 2D image fitting, 3D signed distance field (SDF) reconstruction, and novel view synthesis, demonstrating superior representation ability. The project website is available at https://sites.google.com/view/cvpr24-2034-submission/home.
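To make the tangent-kernel idea concrete, here is a minimal sketch, assuming the GTK is defined analogously to a neural tangent kernel, that is, as the inner product of the model's gradients with respect to the grid weights. The 1D grid, the linear-interpolation ("hat") kernel, and all function names are illustrative assumptions, not the paper's implementation. For a model that is linear in its weights with a fixed kernel, the gradient does not depend on the weights, so the kernel computed at two different weight settings is the same.

```python
def hat_features(x, nodes, spacing):
    """Hat (linear-interpolation) kernel value of x for every grid node."""
    return [max(0.0, 1.0 - abs(x - n) / spacing) for n in nodes]


def model(x, weights, nodes, spacing):
    """Toy grid model: g(x) = sum_i phi_i(x) * w_i."""
    return sum(p * w for p, w in zip(hat_features(x, nodes, spacing), weights))


def grad_wrt_weights(x, weights, nodes, spacing, eps=1e-6):
    """Central finite-difference gradient of g(x) with respect to the weights."""
    grad = []
    for i in range(len(weights)):
        up = list(weights)
        down = list(weights)
        up[i] += eps
        down[i] -= eps
        diff = model(x, up, nodes, spacing) - model(x, down, nodes, spacing)
        grad.append(diff / (2 * eps))
    return grad


def tangent_kernel(x1, x2, weights, nodes, spacing):
    """Inner product of the weight gradients at two query points."""
    g1 = grad_wrt_weights(x1, weights, nodes, spacing)
    g2 = grad_wrt_weights(x2, weights, nodes, spacing)
    return sum(a * b for a, b in zip(g1, g2))


nodes = [0.0, 0.25, 0.5, 0.75, 1.0]
spacing = 0.25
w_init = [0.0] * 5                       # hypothetical weights at initialization
w_later = [0.3, -1.2, 0.7, 2.0, -0.5]    # hypothetical weights later in training

k_same = tangent_kernel(0.3, 0.3, w_init, nodes, spacing)
k_near = tangent_kernel(0.3, 0.4, w_init, nodes, spacing)
k_far = tangent_kernel(0.3, 0.9, w_init, nodes, spacing)
k_near_later = tangent_kernel(0.3, 0.4, w_later, nodes, spacing)
```

In this toy setting the kernel is largest for identical queries, smaller for queries that share grid nodes, and zero for queries with no nodes in common. The weight-independence shown here relies on the kernel being fixed; when the kernel itself has learnable parameters, as the TL;DR describes for MulFAGrid, the gradient also includes those parameters and this simple argument does not apply directly.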

Paper Structure (27 sections, 4 theorems, 52 equations, 6 figures, 6 tables)

Introduction
Related work
Methodology
Understanding grid-based models
Formulations
The grid tangent kernel (GTK) theory
MulFAGrid
Experimental results
Numerical study based on the GTK
2D image fitting
3D signed distance fields reconstruction
Novel view synthesis
Ablation studies
Conclusion
Acknowledgements
...and 12 more sections

Key Result

Theorem 1

Let $\boldsymbol{O}(t)=(g(\boldsymbol{X}_i, \boldsymbol{w}(t)))_{1\leq i \leq n}$ be the outputs of a grid-based model $g$ where $\boldsymbol{X}=(\boldsymbol{X}_i)_{1\leq i \leq n}$ is the input data at time $t$, and $\boldsymbol{Y}=(\boldsymbol{Y}_i)_{1\leq i \leq n}$ is the corresponding label. Th

Figures (6)

Figure 1: Formulations for grid-based models. (A) A grid-based model takes a query coordinate $\boldsymbol{x}$ as input and passes it to an index function $U$, which retrieves a set of feature vectors $\boldsymbol{w}$ from the grid. The model then outputs an average of the feature vectors $\boldsymbol{w}$, weighted by the kernel function $\varphi$. (B) Our formulation supports grid-based models on either a regular or an irregular grid, depending on the index function. The query coordinate is shown in green, and the queried points are in red. Please refer to \ref{sec-grid-model-definition} for more details.
Figure 2: (Left) The diagram of MulFAGrid. The input query coordinate is passed to the multiplicative filter to produce Fourier features and then sent to the normalization layer to compute the aggregation weights. See \ref{sec-mulfagrid-model} for details. (Right) The full architecture for neural radiance fields (NeRF). We obtain the densities $\boldsymbol{\sigma}$ via a MulFAGrid and the activation $\phi_{sp}$. For the colors $\boldsymbol{c}$, we encode the position $\boldsymbol{x}$ via the MulFAGrid with an MLP to post-process the features. After that, we combine the queried spatial features with ray direction information to get color predictions. Please refer to \ref{sec-nerf-results} for detailed explanations.
Figure 3: Analysis of grid-based models (InstantNGP \cite{InstantNGP}, NFFB \cite{NFFB}, NeuRBF \cite{neuRBF}, and ours) based on grid tangent kernels (GTKs) and image regression results. (Top) Visualizations of the GTK Fourier spectrum. MulFAGrid has a wide spectrum, especially in the high-frequency domain, leading to faster convergence for high-frequency components \cite{fourierFFN}. (Middle) Comparisons between the generalization bounds of pairs of methods. In this experiment, we construct a dataset containing only two data points with labels $\boldsymbol{Y}=(\boldsymbol{Y}_1, \boldsymbol{Y}_2)$, shown on the x-axis and y-axis, respectively. MulFAGrid has a tighter (lower) generalization bound for most values of $\boldsymbol{Y}_1$ and $\boldsymbol{Y}_2$. These findings help explain why MulFAGrid demonstrates better representation ability than other grid-based models. (Bottom) Error maps of the fitted images compared with the ground-truth images.
Figure 4: Comparison curves of several grid-based models: InstantNGP \cite{InstantNGP}, NFFB \cite{NFFB}, NeuRBF \cite{neuRBF}, and MulFAGrid. (Left) Training curves of the image regression task on the Kodak dataset \cite{franzen1999kodak}. (Right) The evolution of the normal angular error (NAE) during training on the 3D SDF reconstruction task \cite{neuRBF}.
Figure 5: Rendering results on various scenes from SyntheticNeRF \cite{nerf}, Tanks&Temples \cite{tanksAndTemples}, and Mip-NeRF-360 \cite{mipnerf360}. We compare against the grid-based models NFFB \cite{NFFB} and NeuRBF \cite{neuRBF}, and against the strong baseline 3DGS \cite{3DGS}, which is based on a point cloud initialized from structure-from-motion (SfM).
...and 1 more figures
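
The formulation described in the Figure 1 caption (an index function selects grid nodes, and a kernel weights their features) can be sketched as follows. This is a minimal 1D illustration under stated assumptions: a regular grid over [0, 1], scalar features, a linear-interpolation kernel, and explicit normalization of the kernel weights. The names `index_function`, `kernel`, and `grid_model` are hypothetical and are not taken from the paper.

```python
def index_function(x, num_nodes):
    """Return the indices of the two nodes enclosing x on a regular grid over [0, 1]."""
    spacing = 1.0 / (num_nodes - 1)
    left = min(int(x / spacing), num_nodes - 2)
    return [left, left + 1]


def kernel(x, node_pos, spacing):
    """Linear-interpolation kernel: 1 at the node, falling to 0 one spacing away."""
    return max(0.0, 1.0 - abs(x - node_pos) / spacing)


def grid_model(x, weights):
    """Kernel-weighted average of the features stored at the queried nodes."""
    x = min(max(x, 0.0), 1.0)            # clamp queries to the assumed domain [0, 1]
    num_nodes = len(weights)
    spacing = 1.0 / (num_nodes - 1)
    idx = index_function(x, num_nodes)
    phis = [kernel(x, i * spacing, spacing) for i in idx]
    total = sum(phis)                    # normalize so the kernel weights sum to 1
    return sum(p * weights[i] for p, i in zip(phis, idx)) / total


# Hypothetical scalar features stored at five grid nodes.
weights = [0.0, 1.0, 4.0, 9.0, 16.0]
at_node = grid_model(0.25, weights)      # query exactly on node 1
between = grid_model(0.375, weights)     # query halfway between nodes 1 and 2
at_edge = grid_model(1.0, weights)       # query on the last node
```

Replacing `index_function` with a nearest-neighbour lookup over arbitrary node positions would give an irregular-grid variant of the same sketch, which is the sense in which the caption says the grid type depends on the index function.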

Theorems & Definitions (9)

Definition 1
Definition 2
Theorem 1
Theorem 2
Theorem 3
proof
proof
proof
Theorem 4
