Range Image-Based Implicit Neural Compression for LiDAR Point Clouds
Akihiro Kuwabara, Sorachi Kato, Takuya Fujihashi, Toshiaki Koike-Akino, Takashi Watanabe
TL;DR
This paper tackles the challenge of efficiently compressing high-precision LiDAR range images (RIs) by treating them as floating-point 2D signals. It introduces a dual implicit neural representation (INR) encoding scheme that separately learns a mask image and a depth image, with a patch-wise depth INR to capture high-frequency depth variations and a pixel-wise mask INR for accurate point presence. The INR parameters are progressively pruned and quantized, then entropy-coded to yield compact bitstreams; decoding reconstructs the RI and then converts it back to 3D points. Evaluations on KITTI show that the approach yields better rate-distortion performance than image-based, RI-based, and other INR-based baselines at low bitrates, while maintaining reasonable decoding latency and delivering improved downstream 3D object detection accuracy. The method enables high-quality offline LiDAR scene archives, with future work aimed at reducing encoding time and mitigating point loss during 2D-to-3D reconstruction.
Abstract
This paper presents a novel scheme for efficiently compressing Light Detection and Ranging (LiDAR) point clouds. The scheme enables high-precision 3D scene archives, which pave the way for a detailed understanding of the corresponding 3D scenes. We focus on 2D range images (RIs) as a lightweight format for representing 3D LiDAR observations. Although conventional image compression techniques can be adapted to improve compression efficiency for RIs, their practical performance is expected to be limited due to differences in bit precision and pixel value distribution between natural images and RIs. We propose a novel implicit neural representation (INR)-based RI compression method that effectively handles floating-point valued pixels. The proposed method divides each RI into a depth image and a mask image and compresses them using patch-wise and pixel-wise INR architectures, respectively, each with model pruning and quantization. Experiments on the KITTI dataset show that the proposed method outperforms existing image, point cloud, RI, and INR-based compression methods in terms of 3D reconstruction and detection quality at low bitrates and low decoding latency.
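To make the depth/mask decomposition concrete, the following is a minimal NumPy sketch of splitting an RI into a mask image and a depth image, merging them back, and converting the result to 3D points. It is an illustration under stated assumptions, not the authors' implementation: the function names, the convention that empty pixels hold 0.0, the uniform angular grid, and the field-of-view values are all hypothetical, and the INR fitting, pruning, quantization, and entropy coding stages are omitted.

```python
import numpy as np

def split_range_image(ri, empty_value=0.0):
    """Split an RI into a binary mask (point presence) and a depth image.

    Assumption: pixels without a LiDAR return are stored as `empty_value`.
    """
    mask = ri != empty_value
    depth = np.where(mask, ri, 0.0).astype(np.float32)
    return mask, depth

def merge_range_image(mask, depth, empty_value=0.0):
    """Recombine a (decoded) mask and depth image into a single RI."""
    return np.where(mask, depth, empty_value).astype(np.float32)

def range_image_to_points(ri, fov_up_deg=3.0, fov_down_deg=-25.0, empty_value=0.0):
    """Convert an RI to an (N, 3) point array via spherical back-projection.

    Assumption: rows sample elevation uniformly between the two FOV limits
    and columns sample azimuth uniformly over 360 degrees. The FOV values
    here are placeholders, not taken from the paper.
    """
    h, w = ri.shape
    elev = np.deg2rad(np.linspace(fov_up_deg, fov_down_deg, h))[:, None]
    azim = np.linspace(np.pi, -np.pi, w, endpoint=False)[None, :]
    x = ri * np.cos(elev) * np.cos(azim)
    y = ri * np.cos(elev) * np.sin(azim)
    z = ri * np.sin(elev)
    points = np.stack([x, y, z], axis=-1)
    return points[ri != empty_value]  # keep only pixels with a return

# Toy 4x8 RI with a few empty pixels (values in metres, purely illustrative).
rng = np.random.default_rng(0)
ri = rng.uniform(1.0, 80.0, size=(4, 8)).astype(np.float32)
ri[0, 1] = ri[2, 5] = ri[3, 7] = 0.0

mask, depth = split_range_image(ri)
restored = merge_range_image(mask, depth)
points = range_image_to_points(restored)
```

In this sketch the mask carries only point presence and the depth image carries the floating-point ranges, which is why the two can be modelled separately before being merged again for 2D-to-3D reconstruction.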
