NeuralVDB: High-resolution Sparse Volume Representation using Hierarchical Neural Networks

Doyub Kim, Minjae Lee, Ken Museth

TL;DR

NeuralVDB addresses the growing memory burden of sparse volumetric data by hybridizing the proven VDB structure with neural networks that encode both topology and values. By replacing the two lowest levels of the VDB tree with neural classifiers and regressors and using domain decomposition with Fourier-feature mappings, NeuralVDB achieves compression factors of up to $\sim 60$–$140\times$ with high fidelity, while preserving compatibility with existing VDB pipelines. The authors demonstrate strong compression gains across SDF and density volumes, outperforming several neural baselines (NGLOD, VBNF, INGP) on key metrics and enabling temporally coherent reconstructions via a warm-start encoder. While not a drop-in replacement for all workflows, NeuralVDB offers a practical offline/streaming solution for large sparse volumes, balancing memory reduction, reconstruction quality, and access patterns in real-world scenes.

Abstract

We introduce NeuralVDB, which improves on an existing industry standard for efficient storage of sparse volumetric data, denoted VDB [Museth 2013], by leveraging recent advancements in machine learning. Our novel hybrid data structure can reduce the memory footprints of VDB volumes by orders of magnitude, while maintaining its flexibility and only incurring small (user-controlled) compression errors. Specifically, NeuralVDB replaces the lower nodes of a shallow and wide VDB tree structure with multiple hierarchical neural networks that separately encode topology and value information by means of neural classifiers and regressors respectively. This approach is proven to maximize the compression ratio while maintaining the spatial adaptivity offered by the higher-level VDB data structure. For sparse signed distance fields and density volumes, we have observed compression ratios on the order of 10x to more than 100x from already compressed VDB inputs, with little to no visual artifacts. Furthermore, NeuralVDB is shown to offer more effective compression performance compared to other neural representations such as Neural Geometric Level of Detail [Takikawa et al. 2021], Variable Bitrate Neural Fields [Takikawa et al. 2022a], and Instant Neural Graphics Primitives [Müller et al. 2022]. Finally, we demonstrate how warm-starting from previous frames can accelerate training, i.e., compression, of animated volumes as well as improve temporal coherency of model inference, i.e., decompression.

Paper Structure

This paper contains 35 sections, 4 equations, 17 figures, and 8 tables.

Figures (17)

  • Figure 1: 1D and 2D illustrations of VDB data structures. Left: A 1D 4-level VDB tree hierarchy is shown with its various node structures and bitmasks. The top-most root node (level 3) holds an unbounded set of internal nodes (level 2), and the red/blue internal nodes encode tile values or child pointers using bitmasks ($a_l$ and $c_l$). The lower green leaf nodes store voxel values $f$ and their active masks $a_0$. Right: 2D illustration of the hierarchical tree nodes that intersect the sparse (gray) pixels. The color schemes are shared between the 1D and 2D illustrations. The number of nodes per level is indicated; the level-2 and level-1 internal nodes have $32^3$ and $16^3$ child nodes, respectively, and level-0 (leaf) nodes have $8^3$ voxels each.
  • Figure 2: Illustration of two different NeuralVDB structures: a) a standard VDB tree with neural voxel values ($[\textrm{Hash},5,4,\textrm{NN}(3)]$ using the VDB tree notation), and b) a hybrid VDB/neural tree with neural representations of both nodes and their values ($[\textrm{Hash},5,\textrm{NN}(4),\textrm{NN}(3)]$ using the tree notation).
  • Figure 3: Examples of volumes reconstructed by NeuralVDB from the Disney Cloud dataset.
  • Figure 4: Ground truth SDF VDB models (top row) and reconstructed SDF VDB models using NeuralVDB (bottom row).
  • Figure 5: Reconstructing a VDB from NeuralVDB data. Virtual coordinates at level 1 are classified as one of 1) child node, 2) active tile, or 3) inactive tile. From the resulting vector at b), active-mask coordinates are passed to the tile value regressor to reconstruct the tile values at c). Input coordinates whose child mask is on are passed to the level-0 mask classifier to determine the active voxel state, and the active voxels at d) are finally used to infer the voxel values that reconstruct level 0 at e).
  • ...and 12 more figures
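
The node sizes in the Figure 1 caption ($32^3$, $16^3$, $8^3$) imply a simple bit-arithmetic lookup from a global voxel coordinate to a position at each tree level. The sketch below illustrates that arithmetic only. The function names (`total_log2`, `child_offset`, `root_key`, `locate`) are hypothetical, and the x-major bit layout is an assumption modeled on the common VDB convention rather than something stated in this summary.

```python
# Per-axis branching factor of each level, stored as log2:
# level 2 has 32^3 children, level 1 has 16^3 children, level 0 has 8^3 voxels.
LOG2_DIMS = {2: 5, 1: 4, 0: 3}


def total_log2(level):
    """log2 of the per-axis voxel extent covered by one node at `level`."""
    return sum(LOG2_DIMS[l] for l in range(level + 1))


def child_offset(coord, level):
    """Linear index of the child (or voxel, at level 0) holding `coord`.

    Assumes an x-major layout: offset = (ix << 2*log2_dim) | (iy << log2_dim) | iz.
    """
    log2_dim = LOG2_DIMS[level]
    node_mask = (1 << total_log2(level)) - 1      # keep bits local to this node
    child_shift = total_log2(level) - log2_dim    # drop bits owned by the child
    ix, iy, iz = ((c & node_mask) >> child_shift for c in coord)
    return (ix << (2 * log2_dim)) | (iy << log2_dim) | iz


def root_key(coord):
    """Origin of the level-2 node containing `coord`; usable as a hash key in the root."""
    span_mask = (1 << total_log2(2)) - 1
    return tuple(c & ~span_mask for c in coord)


def locate(coord):
    """Path from the root down to the voxel for a global integer coordinate."""
    return {
        "root_key": root_key(coord),
        "level2": child_offset(coord, 2),
        "level1": child_offset(coord, 1),
        "level0": child_offset(coord, 0),
    }


if __name__ == "__main__":
    print(locate((4097, 130, 9)))
```

With these sizes a single level-2 node spans $2^{5+4+3} = 4096$ voxels per axis, so the root only needs to hash coordinates rounded down to multiples of 4096. In the hybrid structure of Figure 2b, the level-1 and level-0 lookups are the ones replaced by neural networks.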