PU-Ray: Domain-Independent Point Cloud Upsampling via Ray Marching on Neural Implicit Surface
Sangwon Lim, Karim El-Basyouny, Yee Hong Yang
TL;DR
This paper tackles domain dependency in LiDAR point cloud upsampling by reframing upsampling as depth prediction along query rays on a neural implicit surface defined by a UDF. The method, PU-Ray, uses a ray marching pipeline driven by a Point Transformer encoder to define an implicit surface and predict ray depths, enabling an arbitrary upsampling rate $r$ via $|Q| = |S| \times (r-1)$ and supporting both supervised and self-supervised training. Key contributions include the neural implicit surface with $MLP_I$ and $MLP_{\epsilon}$, a novel rule-based query generation scheme for uniform sampling, and ablations showing efficiency with a small parameter count while achieving state-of-the-art metrics on synthetic datasets and robustness on real scans. The results suggest that ray-based upsampling over a local implicit surface can generalize across domains and ROI configurations, with practical impact for 3D reconstruction and ITS applications, while leaving room for acceleration in real-time industrial contexts.
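As a quick worked instance of the rate relation $|Q| = |S| \times (r-1)$, assume the upsampled cloud is the union of the input points $S$ and one predicted point per query ray in $Q$ (the formula suggests this reading, but the summary does not state it outright). The sizes $|S| = 1024$ and $r = 4$ are illustrative values, not figures from the paper:

$$|Q| = 1024 \times (4 - 1) = 3072, \qquad |S| + |Q| = 1024 + 3072 = 4096 = r\,|S|.$$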
Abstract
While recent advancements in deep-learning point cloud upsampling methods have improved the input to intelligent transportation systems, they still suffer from issues of domain dependency between synthetic and real-scanned point clouds. This paper addresses the above issues by proposing a new ray-based upsampling approach with an arbitrary rate, where a depth prediction is made for each query ray and its corresponding patch. Our novel method simulates the sphere-tracing ray marching algorithm on the neural implicit surface defined with an unsigned distance function (UDF) to achieve more precise and stable ray-depth predictions by training a point-transformer-based network. The rule-based mid-point query sampling method generates more evenly distributed points without requiring an end-to-end model trained using a nearest-neighbor-based reconstruction loss function, which may be biased towards the training dataset. Self-supervised learning becomes possible with accurate ground truths within the input point cloud. The results demonstrate the method's versatility across domains and training scenarios with limited computational resources and training data. Comprehensive analyses of synthetic and real-scanned applications provide empirical evidence for the significance of the upsampling task across the computer vision and graphics domains to real-world applications of ITS.
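The sphere-tracing idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes an analytic UDF (the plane $z = 0$) as a stand-in for the learned implicit surface. The names `udf_plane` and `sphere_trace`, the tolerance `eps`, and the step cap `max_steps` are all hypothetical choices for illustration. PU-Ray itself trains a network to make the ray-depth predictions, so this shows only the classical algorithm being simulated, not the authors' implementation.

```python
import math


def udf_plane(p):
    """Unsigned distance from point p to the plane z = 0.

    Assumption: an analytic stand-in for a learned UDF.
    """
    return abs(p[2])


def sphere_trace(origin, direction, udf, eps=1e-6, max_steps=64):
    """March along a ray, stepping by the unsigned distance at each point.

    Returns the ray depth t at which udf(origin + t * d) < eps,
    or None if the ray does not converge within max_steps.
    """
    # Normalize the direction so that t is a metric depth along the ray.
    norm = math.sqrt(sum(c * c for c in direction))
    d = [c / norm for c in direction]

    t = 0.0
    for _ in range(max_steps):
        p = [o + t * c for o, c in zip(origin, d)]
        dist = udf(p)
        if dist < eps:
            return t
        # The surface is at least `dist` away in every direction,
        # so advancing by `dist` cannot step past it.
        t += dist
    return None


# Oblique ray from (0, 0, 1) towards the plane; the exact depth is 1 / 0.8 = 1.25.
depth = sphere_trace((0.0, 0.0, 1.0), (0.6, 0.0, -0.8), udf_plane)

# A ray pointing away from the plane never converges.
miss = sphere_trace((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), udf_plane)
```

Because each step equals the current unsigned distance, the march approaches the surface from one side without overshooting. This is why the same scheme works with a UDF, which carries no inside/outside sign.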
