NoKSR: Kernel-Free Neural Surface Reconstruction via Point Cloud Serialization
Zhen Li, Weiwei Sun, Shrisudhan Govindarajan, Shaobo Xia, Daniel Rebain, Kwang Moo Yi, Andrea Tagliasacchi
TL;DR
NoKSR tackles large-scale surface reconstruction from irregular point clouds by predicting a neural signed distance field without kernel-based voxelization. It relies on locality-preserving point cloud serialization and a lightweight multi-scale aggregation built on PointTransformerV3 to gather local context across several serialization levels, then decodes the aggregated feature to an SDF value $D(\mathbf{q})$ at a query point $\mathbf{q}$ with a simple MLP. The method introduces a Hilbert-curve-based neighborhood function that retrieves approximate neighbors with minimal information loss, and it is trained with three losses, $\mathcal{L}_{\text{SDF}}$, $\mathcal{L}_{\text{Eikonal}}$, and $\mathcal{L}_{\text{mask}}$, so that the prediction is accurate near the surface and behaves like a distance field. Empirically, NoKSR delivers state-of-the-art accuracy and substantially lower latency on outdoor datasets while maintaining strong indoor performance; it also generalizes well across domains, and ablations support the design choices. Because the approach is point-based and kernel-free, it avoids the information loss of voxelization and shows practical potential for scalable, high-fidelity surface reconstruction in real-world scenarios.
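To make "behaves like a distance field" concrete: a true signed distance function has a gradient of unit norm almost everywhere, and an Eikonal term penalizes deviations from that. The sketch below checks the property numerically on an analytic sphere SDF. It is illustrative only: the exact form of $\mathcal{L}_{\text{Eikonal}}$ used by the authors is not given in this summary, and `sphere_sdf` and `eikonal_residual` are hypothetical names introduced here.

```python
import numpy as np

def sphere_sdf(q, radius=1.0):
    """Analytic signed distance to a sphere centered at the origin."""
    return np.linalg.norm(q, axis=-1) - radius

def eikonal_residual(sdf, q, eps=1e-4):
    """Mean squared deviation of the gradient norm from 1.

    The gradient is estimated with central finite differences; a learned
    model would typically use automatic differentiation instead.
    """
    grads = np.stack(
        [(sdf(q + eps * e) - sdf(q - eps * e)) / (2 * eps)
         for e in np.eye(q.shape[-1])],
        axis=-1,
    )
    return np.mean((np.linalg.norm(grads, axis=-1) - 1.0) ** 2)

# Query points kept away from the origin, where the sphere SDF is smooth.
queries = np.random.default_rng(0).uniform(0.5, 2.0, size=(100, 3))
residual_true = eikonal_residual(sphere_sdf, queries)
# A field scaled by 2 has the same zero level set but is not a distance field.
residual_scaled = eikonal_residual(lambda q: 2.0 * sphere_sdf(q), queries)
```

The scaled field shares the sphere's surface yet has a gradient norm of 2, so its residual is large; this is the kind of solution an Eikonal term is meant to rule out.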
Abstract
We present a novel approach to large-scale point cloud surface reconstruction by developing an efficient framework that converts an irregular point cloud into a signed distance field (SDF). Our backbone builds upon recent transformer-based architectures (i.e., PointTransformerV3), which serialize the point cloud into a locality-preserving sequence of tokens. We efficiently predict the SDF value at a point by aggregating nearby tokens, where fast approximate neighbors can be retrieved thanks to the serialization. We serialize the point cloud at different levels/scales and non-linearly aggregate the resulting features into a single feature from which the SDF value is predicted. We show that aggregating across multiple scales is critical to overcoming the approximations introduced by the serialization (i.e., false negatives in the neighborhood). Our framework sets a new state of the art in terms of accuracy and efficiency (better or similar performance with half the latency of the best prior method, coupled with a simpler implementation), particularly on outdoor datasets, where sparse-grid methods have shown limited performance.
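The idea of retrieving approximate neighbors from a serialized point cloud can be sketched in a few lines. The example below is not the authors' implementation: for brevity it uses a Z-order (Morton) curve rather than a Hilbert curve, it takes a plain union over levels instead of a learned non-linear aggregation, and all function names and parameters (`morton_codes`, `serialize`, `approx_neighbors`, `multiscale_neighbors`, `grid_size`, `k`) are hypothetical. Points are assumed to have non-negative coordinates.

```python
import numpy as np

def morton_codes(points, grid_size, bits=10):
    """Quantize 3D points to a grid and interleave the bits of (x, y, z)."""
    cells = np.floor(points / grid_size).astype(np.int64)
    cells = np.clip(cells, 0, (1 << bits) - 1)
    codes = np.zeros(len(cells), dtype=np.int64)
    for b in range(bits):
        for axis in range(3):
            codes |= ((cells[:, axis] >> b) & 1) << (3 * b + axis)
    return codes

def serialize(points, grid_size):
    """Return the point order along the curve and the sorted codes."""
    codes = morton_codes(points, grid_size)
    order = np.argsort(codes, kind="stable")
    return order, codes[order]

def approx_neighbors(query, order, sorted_codes, grid_size, k=8):
    """Take a window of k points around the query's position in the sequence."""
    q_code = morton_codes(query[None, :], grid_size)[0]
    pos = np.searchsorted(sorted_codes, q_code)  # binary search, O(log N)
    lo = max(0, pos - k // 2)
    hi = min(len(order), lo + k)
    lo = max(0, hi - k)
    return order[lo:hi]

def multiscale_neighbors(query, points, grid_sizes, k=8):
    """Union of the windows found at several serialization scales."""
    found = []
    for g in grid_sizes:
        order, sorted_codes = serialize(points, g)
        found.append(approx_neighbors(query, order, sorted_codes, g, k))
    return np.unique(np.concatenate(found))

points = np.random.default_rng(0).random((500, 3))
order, sorted_codes = serialize(points, 0.05)
single = approx_neighbors(points[0], order, sorted_codes, 0.05, k=8)
multi = multiscale_neighbors(points[0], points, [0.05, 0.1, 0.2], k=8)
```

A window in a one-dimensional ordering can miss points that are spatially close but far apart along the curve (the false negatives mentioned above). Querying several scales yields candidate sets that differ in which neighbors they miss, which is the intuition behind combining them.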
