TaylorGrid: Towards Fast and High-Quality Implicit Field Learning via Direct Taylor-based Grid Optimization

Renyi Mao, Qingshan Xu, Peng Zheng, Ye Wang, Tieru Wu, Rui Ma

TL;DR

TaylorGrid introduces a grid-based implicit-field representation that embeds low-order Taylor expansion coefficients directly on a dense grid to enable fast and high-quality learning for both SDF geometry reconstruction and Neural Radiance Fields. By querying a point x through the Taylor-expanded neighborhoods of eight surrounding grid vertices and then converting to a linear voxel for tri-linear interpolation, the method achieves a favorable balance between the speed of linear grids and the expressivity of neural voxels, without requiring heavy MLPs. Empirical results on 3D geometry and NeRF tasks show faster convergence and competitive or superior quality (e.g., CD, IoU, PSNR, LPIPS) with modest memory requirements, and ablations indicate that a second-order Taylor expansion often suffices. The approach is versatile and can be integrated into existing implicit-field frameworks, offering a practical path toward fast and high-quality implicit field learning, while acknowledging memory and scalability considerations for very high-resolution grids and potential extensions with sparse data structures.
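
The query described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`taylor_grid_query`, `exact_coefficients`), the unit-cube domain, and the storage layout (a value, a gradient, and a dense symmetric Hessian per vertex) are all assumptions made for illustration. Each of the eight surrounding vertices evaluates its own Taylor polynomial at the query point, and the eight estimates are then blended with tri-linear weights.

```python
import numpy as np

def taylor_grid_query(x, c, g, H=None):
    """Query a Taylor-style grid at one point x in [0, 1]^3 (hypothetical layout).

    c: (R, R, R) values, g: (R, R, R, 3) gradients,
    H: optional (R, R, R, 3, 3) Hessians for the second-order term.
    """
    R = c.shape[0]
    h = 1.0 / (R - 1)                                   # grid spacing
    x = np.asarray(x, dtype=float)
    base = np.clip(np.floor(x / h).astype(int), 0, R - 2)  # lower voxel corner
    t = x / h - base                                    # local coords in [0, 1]^3
    out = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                corner = np.array([dx, dy, dz])
                i, j, k = base + corner
                d = x - (base + corner) * h             # offset from this vertex
                val = c[i, j, k] + g[i, j, k] @ d       # first-order estimate
                if H is not None:
                    val += 0.5 * d @ H[i, j, k] @ d     # second-order correction
                w = np.prod(np.where(corner == 1, t, 1.0 - t))  # tri-linear weight
                out += w * val
    return out

def exact_coefficients(R, A, b, c0):
    """Taylor coefficients of f(v) = v^T A v + b.v + c0 (A symmetric) on an R^3 grid."""
    axis = np.linspace(0.0, 1.0, R)
    V = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    c = np.einsum("...i,ij,...j->...", V, A, V) + V @ b + c0
    g = 2.0 * V @ A + b                                 # gradient 2Av + b
    H = np.broadcast_to(2.0 * A, (R, R, R, 3, 3))       # constant Hessian 2A
    return c, g, H

# Demo: a quadratic field is reproduced by the second-order query.
A = np.array([[1.0, 0.2, 0.0], [0.2, 0.5, 0.1], [0.0, 0.1, 2.0]])
b = np.array([0.3, -1.0, 0.7])
c, g, H = exact_coefficients(5, A, b, 0.25)
x = np.array([0.37, 0.81, 0.12])
print(taylor_grid_query(x, c, g, H), x @ A @ x + b @ x + 0.25)
```

In actual training the coefficients would be optimized directly by gradient descent rather than filled in analytically; the analytic fill is used here only to check the query. Passing `H=None` gives the first-order variant.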

Abstract

Coordinate-based neural implicit representations, or implicit fields, have been widely studied for 3D geometry representation and novel view synthesis. Recently, a series of efforts have been devoted to accelerating coordinate-based implicit field learning and improving its quality. Instead of learning heavy MLPs to predict the neural implicit values for the query coordinates, neural voxels or grids combined with shallow MLPs have been proposed to achieve high-quality implicit field learning with reduced optimization time. On the other hand, lightweight field representations such as linear grids have been proposed to further improve the learning speed. In this paper, we aim for both fast and high-quality implicit field learning, and propose TaylorGrid, a novel implicit field representation which can be efficiently computed via direct Taylor expansion optimization on 2D or 3D grids. As a general representation, TaylorGrid can be adapted to different implicit field learning tasks such as SDF learning or NeRF. Extensive quantitative and qualitative comparisons show that TaylorGrid achieves a balance between the linear grid and neural voxels, demonstrating its superiority in fast and high-quality implicit field learning.

Paper Structure (13 sections, 11 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Linear grid methods store a signal $d\left(\textcolor{cyan}{v_i}\right)$ at each grid vertex and obtain the query result by tri-linear interpolation. SMLP methods store features $z\left(\textcolor{cyan}{v_i}\right)$ at grid vertices; they first obtain the feature of a queried point, $z\left(\textcolor{orange}{q}\right)$, by tri-linear interpolation and then decode it into an implicit value with a shallow MLP. We use Taylor expansions to fit the signals of an implicit field. Our method is simple yet effective: it stores a group of Taylor expansion coefficients at each grid vertex, which are used to approximate the implicit value with a Taylor polynomial. We then combine the approximated values computed from all neighboring grid vertices by tri-linear interpolation. The method has a convergence rate comparable to linear grid methods and a quality comparable to shallow MLP methods.
  • Figure 2: We use the Kuzushiji-MNIST (kMNIST) handwritten dataset to reconstruct image-based signed distance fields (SDF) and show the reconstruction results of different methods at different grid resolutions. $1/64$ means the grid resolution is 64 times smaller than the original image, while $1/32$ means it is 32 times smaller. Each column shows the SDF reconstruction results, with the ground-truth images in the rightmost column. The compared methods are linear interpolation, Taylor grid, and neural voxels combined with a small MLP.
  • Figure 3: Qualitative results for the geometry reconstruction. Comparisons are made with DeepSDF, linear grid, shallow MLP, and Taylor. For Taylor, a grid of $128^3$ resolution is used.
  • Figure 4: The chart shows the IoU curves over training time for Linear, SMLP, $1^{st}$ Taylor, and Taylor. All methods are trained for the same number of epochs; the gray dashed vertical lines indicate the end of training. For the same number of epochs, our method, especially $1^{st}$ Taylor, finishes training more quickly. Our method converges as fast as Linear and reaches a quality as good as SMLP.
  • Figure 5: Qualitative results for the ablation study on geometry reconstruction. Comparisons evaluate the effects of grid resolution (64 vs. 128), the shallow MLP, the $L_{TV}$ loss, and the first-order Taylor expansion. Among the results, the full Taylor$_{128}$ model achieves the best quality.
  • ...and 2 more figures