Compression of Large-Scale 3D Point Clouds Based on Joint Optimization of Point Sampling and Feature Extraction

Jae-Young Yim, Jae-Young Sim

TL;DR

This work tackles the compression of large-scale 3D LiDAR point clouds by addressing the inefficiency of separating point sampling from feature extraction. It introduces a fully end-to-end framework that jointly optimizes a learnable point sampling network and adaptive feature aggregation within a range-image partitioning scheme, coupled with an expansion-and-fusion decoder for robust reconstruction. The method uses a hyperprior-based entropy model to approximate the rate R and trains against the rate-distortion objective L = D + λR, where D is the distortion and λ sets the trade-off between the two terms. It achieves significantly better rate-distortion performance than state-of-the-art baselines on SemanticKITTI and nuScenes, including strong cross-dataset generalization. The approach reduces encoding time and memory while delivering higher-fidelity reconstructions, making LS3DPC compression more practical for large-scale applications.
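As a rough illustration of the objective L = D + λR, the sketch below computes a distortion term and a rate term with NumPy. It is not the authors' implementation. It assumes a squared-distance Chamfer distance for D and a rate normalized to bits per input point; the function names (`chamfer_distance`, `rate_bits`, `rd_loss`) are hypothetical, and the symbol likelihoods are passed in directly rather than produced by a hyperprior network.

```python
import numpy as np

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumption: the squared-Euclidean variant, averaged in each direction.
    """
    # Pairwise squared distances, shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    # Mean nearest-neighbour distance from a to b, plus from b to a.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

def rate_bits(likelihoods: np.ndarray) -> float:
    """Estimated code length in bits: the sum of -log2(p) over coded symbols."""
    return float(-np.log2(likelihoods).sum())

def rd_loss(original: np.ndarray, reconstructed: np.ndarray,
            likelihoods: np.ndarray, lam: float) -> float:
    """Rate-distortion objective L = D + lam * R."""
    d = chamfer_distance(original, reconstructed)
    # Assumption: the rate is expressed in bits per input point.
    r = rate_bits(likelihoods) / original.shape[0]
    return d + lam * r

if __name__ == "__main__":
    pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    probs = np.array([0.5, 0.25])  # hypothetical symbol likelihoods
    print(rd_loss(pts, pts, probs, lam=0.1))
```

A larger λ penalizes bits more heavily, which pushes the trained model toward lower bitrates at the cost of higher distortion.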

Abstract

Large-scale 3D point clouds (LS3DPC) obtained by LiDAR scanners require huge storage space and transmission bandwidth because of their large data volume. Existing LS3DPC compression methods perform rule-based point sampling and learnable feature extraction separately, and hence achieve limited compression performance. In this paper, we propose a fully end-to-end training framework for LS3DPC compression in which point sampling and feature extraction are jointly optimized in terms of the rate and distortion losses. To this end, we first make the point sampling module trainable, so that the optimal position of each downsampled point is estimated via aggregation with learnable weights. We also develop a reliable point reconstruction scheme that adaptively aggregates the expanded candidate points to refine the positions of the upsampled points. Experimental results on the SemanticKITTI and nuScenes datasets show that the proposed method achieves significantly higher compression ratios than the existing state-of-the-art methods.
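The idea of estimating a downsampled position "via aggregation with learnable weights" can be sketched as a weighted average over a local group of points. This is a minimal sketch under stated assumptions, not the network described in the paper: the weights here come from a softmax over per-point scores, the scores are given as plain arrays instead of being predicted by a network, and the names `softmax` and `aggregate_positions` are hypothetical.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_positions(groups: np.ndarray, logits: np.ndarray) -> np.ndarray:
    """Collapse each group of K points into one downsampled point.

    groups: (G, K, 3) point coordinates, K points per local group.
    logits: (G, K) per-point scores (assumed to come from a trainable network).
    Returns (G, 3): a convex combination of the points in each group.
    """
    weights = softmax(logits, axis=1)  # (G, K), each row sums to 1
    return np.einsum("gk,gkd->gd", weights, groups)

if __name__ == "__main__":
    groups = np.array([[[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]])
    print(aggregate_positions(groups, np.zeros((1, 2))))  # equal weights: midpoint
```

Because the output is a smooth function of the scores, gradients from the rate and distortion losses can flow back into whatever produces them, which a rule-based choice of a single representative point does not allow.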

Paper Structure

This paper contains 23 sections, 6 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: The concept of the proposed joint optimization framework between learnable point sampling and feature extraction, compared to the existing methods where the rule-based sampling is performed separately from the feature extraction.
  • Figure 2: Overall procedures of the proposed method.
  • Figure 3: Network architecture of the compression module where the point sampling and feature aggregation are jointly trained.
  • Figure 4: Network architecture of the decompression module where the point reconstruction is performed via expansion and fusion.
  • Figure 5: Rate-distortion curves of the proposed method compared with those of three state-of-the-art methods: D-PCC, 3QNet, and DEPOCO. (a) Bitrate vs. CD curves and (b) Bitrate vs. PSNR curves are evaluated on the SemanticKITTI dataset. (c) Bitrate vs. CD curves and (d) Bitrate vs. PSNR curves are evaluated on the nuScenes dataset. The training and test data are from the same dataset.
  • ...and 5 more figures