RENO: Real-Time Neural Compression for 3D LiDAR Point Clouds
Kang You, Tong Chen, Dandan Ding, M. Salman Asif, Zhan Ma
TL;DR
RENO tackles the real-time bottleneck in LiDAR point cloud compression (LPCC) by eliminating time-consuming octree generation and multi-stage upsampling, replacing them with multiscale sparse tensors and sparse occupancy codes. A cross-scale context model (TOP) predicts 8-bit occupancy codes in a one-shot, scale-wise fashion, while fast converters (FOG/FCG) enable parallel, low-latency code generation and reconstruction. A bitwise two-stage coding scheme further reduces arithmetic-coding latency, enabling ~10–20 fps on standard GPUs for 12–14 bit LiDAR frames, with BD-BR gains over G-PCCv23 and Draco and competitive downstream task performance. The resulting model is compact (about 1 MB), yet it outperforms existing real-time compressors in rate-distortion and preserves geometry well enough for effective 3D object detection, making it attractive for on-device or vehicle-to-vehicle LiDAR data sharing. Overall, RENO shows that cross-scale occupancy coding combined with efficient sparse-tensor processing can achieve real-time neural LPCC without sacrificing reconstruction quality or downstream task performance.
Abstract
Despite the substantial advances demonstrated by learning-based neural models in the LiDAR Point Cloud Compression (LPCC) task, real-time compression, an indispensable requirement for numerous industrial applications, remains a formidable challenge. This paper proposes RENO, the first real-time neural codec for 3D LiDAR point clouds, achieving superior performance with a lightweight model. RENO skips octree construction and builds directly upon the multiscale sparse tensor representation. Instead of multi-stage inference, RENO devises sparse occupancy codes, which exploit cross-scale correlation and derive voxels' occupancy in a one-shot manner, greatly reducing processing time. Experimental results demonstrate that RENO achieves real-time coding speed, 10 fps at 14-bit depth on a desktop platform (e.g., one RTX 3090 GPU) for both encoding and decoding, while providing 12.25% and 48.34% bit-rate savings compared to G-PCCv23 and Draco, respectively, at similar quality. The RENO model is only 1 MB in size, making it attractive for practical applications. The source code is available at https://github.com/NJUVISION/RENO.
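To make the idea of sparse occupancy codes concrete, the following is a minimal NumPy sketch, not the authors' implementation. It packs the occupied child voxels of each parent voxel into one 8-bit code and expands the codes back into child coordinates, with no octree built along the way. The function names (occupancy_codes, children_from_codes, split_nibbles) and the bit ordering are hypothetical. Splitting each code into two 4-bit halves is only one assumed reading of the "bitwise two-stage coding" mentioned in the TL;DR, and the learned context model and arithmetic coder are omitted entirely.

```python
import numpy as np

def occupancy_codes(child_xyz):
    """Pack occupied child voxels into one 8-bit code per parent voxel.

    child_xyz: (N, 3) non-negative integer voxel coordinates at the fine scale.
    Returns (parents, codes): unique parent coordinates at the next coarser
    scale and a uint8 code per parent whose bit k marks child octant k.
    The octant ordering (x, y, z as bits 2, 1, 0) is an assumption.
    """
    child_xyz = np.asarray(child_xyz, dtype=np.int64)
    parent_xyz = child_xyz >> 1  # halve the resolution
    octant = (((child_xyz[:, 0] & 1) << 2)
              | ((child_xyz[:, 1] & 1) << 1)
              | (child_xyz[:, 2] & 1))
    parents, inverse = np.unique(parent_xyz, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flat index regardless of NumPy version
    codes = np.zeros(len(parents), dtype=np.uint8)
    # Unbuffered OR so several children of one parent accumulate in its code.
    np.bitwise_or.at(codes, inverse, (1 << octant).astype(np.uint8))
    return parents, codes

def children_from_codes(parents, codes):
    """Expand each parent's 8-bit code back into child voxel coordinates."""
    bits = (codes[:, None] >> np.arange(8)) & 1  # (N, 8) occupancy flags
    parent_idx, octant = np.nonzero(bits)
    offsets = np.stack(
        [(octant >> 2) & 1, (octant >> 1) & 1, octant & 1], axis=1)
    return (parents[parent_idx] << 1) + offsets

def split_nibbles(codes):
    """Assumed two-stage split: high and low 4-bit halves of each code."""
    return codes >> 4, codes & 0x0F

if __name__ == "__main__":
    pts = np.array([[0, 0, 0], [1, 1, 1], [2, 0, 0], [3, 1, 0]])
    parents, codes = occupancy_codes(pts)
    print(parents, codes)
    print(children_from_codes(parents, codes))
    print(split_nibbles(codes))
```

Applying occupancy_codes repeatedly to the returned parents yields one set of codes per scale, which is the kind of multiscale representation a cross-scale context model would consume. Under the assumed split, each stage would model a 16-symbol alphabet rather than a 256-symbol one, which is one way such a scheme could lower arithmetic-coding cost.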
