Oriented-grid Encoder for 3D Implicit Representations
Arihant Gaur, G. Dias Pais, Pedro Miraldo
TL;DR
The paper introduces an oriented-grid encoder for 3D implicit representations that aligns multi-resolution grid cells with surface normals. It combines a dual-tree structure (a structured octree and an orientation tree) with a cylindrical interpolation scheme and a shared 3D CNN for local feature aggregation, producing rotation-invariant and smoother features. The method achieves state-of-the-art results on ABC, Thingi10k, ShapeNet, and Matterport3D, converging faster and producing sharper surfaces than regular-grid baselines. The approach is promising for more robust object and scene reconstruction and may extend to neural radiance fields and large-scale scenes.
Abstract
Encoding 3D points is one of the primary steps in learning-based implicit scene representation. Using features that gather information from neighbors with multi-resolution grids has proven to be the best geometric encoder for this task. However, prior techniques do not exploit characteristics shared by most objects and scenes, such as surface normals and local smoothness. This paper is the first to explicitly exploit those characteristics in 3D geometric encoders. In contrast to prior work, which uses multiple levels of detail, regular cube grids, and trilinear interpolation, we propose 3D-oriented grids with a novel cylindrical volumetric interpolation for modeling local planar invariance. In addition, we explicitly include a local feature aggregation step that regularizes and smooths the cylindrical interpolation features. We evaluate our approach on ABC, Thingi10k, ShapeNet, and Matterport3D, for object and scene representation. Compared to regular grids, our geometric encoder converges in fewer steps and obtains sharper 3D surfaces. Compared to prior techniques, our method achieves state-of-the-art results.
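To make the idea of local planar invariance more concrete, the following is a minimal sketch, not the paper's implementation. It assumes a grid cell described by a center and a normal direction, and expresses a query point in cylindrical coordinates (radial distance and axial height) about the normal axis. The function names `cylindrical_coords` and `rotate_about_axis` are hypothetical and chosen for illustration only. The sketch shows that these two coordinates do not change when the point is rotated about the normal, which is the kind of invariance a cylindrical interpolation can build on.

```python
import math


def dot(a, b):
    """Dot product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(x / norm for x in v)


def cylindrical_coords(point, center, normal):
    """Return (radial, axial) of `point` about the axis through `center`
    along `normal`. Hypothetical helper, assumed for illustration."""
    n = normalize(normal)
    d = tuple(p - c for p, c in zip(point, center))
    axial = dot(d, n)  # signed height along the normal
    perp = tuple(di - axial * ni for di, ni in zip(d, n))
    radial = math.sqrt(dot(perp, perp))  # distance from the normal axis
    return radial, axial


def rotate_about_axis(point, center, normal, angle):
    """Rotate `point` by `angle` radians about the axis through `center`
    along `normal`, using Rodrigues' rotation formula."""
    n = normalize(normal)
    d = tuple(p - c for p, c in zip(point, center))
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cross = (
        n[1] * d[2] - n[2] * d[1],
        n[2] * d[0] - n[0] * d[2],
        n[0] * d[1] - n[1] * d[0],
    )
    k = dot(n, d) * (1.0 - cos_a)
    rotated = tuple(d[i] * cos_a + cross[i] * sin_a + n[i] * k for i in range(3))
    return tuple(r + c for r, c in zip(rotated, center))


if __name__ == "__main__":
    center = (0.0, 0.0, 0.0)
    normal = (0.0, 0.0, 2.0)  # need not be unit length; normalized internally
    p = (3.0, 4.0, 5.0)
    q = rotate_about_axis(p, center, normal, 1.234)
    # Both points yield the same (radial, axial) pair up to rounding.
    print(cylindrical_coords(p, center, normal))
    print(cylindrical_coords(q, center, normal))
```

Because the angular coordinate is dropped, any feature computed from the radial and axial values alone is unchanged by rotations within the plane orthogonal to the normal. How the paper actually weights and combines cell features is not specified in the abstract, so that part is left out of the sketch.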
