Lossless Point Cloud Geometry and Attribute Compression Using a Learned Conditional Probability Model

Dat Thanh Nguyen, Andre Kaup

TL;DR

The paper presents an efficient lossless point cloud compression method that uses sparse tensor-based deep neural networks to learn point cloud geometry and color probability distributions and to build an accurate auto-regressive context model for an arithmetic coder.

Abstract

In recent years, we have witnessed the presence of point cloud data in many aspects of our lives, from immersive media and autonomous driving to healthcare, although at the cost of a tremendous amount of data. In this paper, we present an efficient lossless point cloud compression method that uses sparse tensor-based deep neural networks to learn point cloud geometry and color probability distributions. Our method represents a point cloud with both an occupancy feature and three attribute features at different bit depths in a unified sparse representation. This allows us to efficiently exploit feature-wise and point-wise dependencies within point clouds using a sparse tensor-based neural network and thus build an accurate auto-regressive context model for an arithmetic coder. To the best of our knowledge, this is the first learning-based lossless point cloud geometry and attribute compression approach. Compared with the state-of-the-art lossless point cloud compression method from the Moving Picture Experts Group (MPEG), our method achieves a 22.6% reduction in total bitrate on a diverse set of test point clouds, with 49.0% and 18.3% rate reductions on the geometry and color attribute components, respectively.
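
As a rough illustration of what an auto-regressive context model for an arithmetic coder amounts to, the factorization below is one way to write the feature-wise and point-wise dependencies described in the abstract. The notation is an assumption made for this sketch and is not taken from the paper's equations: $f^0$ is the occupancy feature, $f^1$ to $f^3$ are the three attribute features, $f_i^k$ is the $i$-th coded element of feature $k$ in coding order, and $N_k$ is the number of coded elements of feature $k$.

$$
p(f^0, f^1, f^2, f^3) = \prod_{k=0}^{3} \prod_{i=1}^{N_k} p\left(f_i^k \mid f_{<i}^k,\, f^0, \dots, f^{k-1}\right),
\qquad
R \approx -\sum_{k=0}^{3} \sum_{i=1}^{N_k} \log_2 p\left(f_i^k \mid f_{<i}^k,\, f^0, \dots, f^{k-1}\right) \text{ bits}
$$

For $k=0$ the set of previous features is empty. The second expression is the ideal code length that an arithmetic coder approaches when driven by these conditional probabilities, so a more accurate learned model translates directly into a lower bitrate.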

Paper Structure (15 sections, 8 equations, 11 figures, 8 tables)

Figures (11)

  • Figure 1: Architecture overview of the proposed method. (a): An $n$-bit-depth point cloud is partitioned down to octree level $n-6$, yielding $b$ occupied blocks of size $64\times 64 \times 64$ ($B_1$ to $B_b$). (b): Each block contains four features: geometry $f^0$ and three color features $f^{1-3}$. We encode the features sequentially from $f^0$ to $f^3$; each later feature is encoded conditionally on the previously encoded features. In this figure, the last feature $f^3$ is being encoded. The top bar shows the output bitstream: the high-level octree is stored first, and the feature bitstreams of each block $B_m$ are marked with corresponding colors.
  • Figure 2: An example of causal context in a ${4\times4}$ block. The bottom layer is the occupancy feature; the next two layers are color features. Empty space is shown in white. The area inside the yellow boundaries is the context used to encode the current sub-point ${f^2_i}$. Red arrows show the 3D raster-scan order, which is also the encoding order. For visual simplicity, the prediction of ${f^3}$ is not shown.
  • Figure 3: CNeT neural network architectures. (a): Occupancy CNeT (O-CNeT), (b): Color feature CNeT (C-CNeT). Masks are applied in all convolution layers in both networks except for the previous feature path, colored in bright brown. Type A masks are applied in the first layer of O-CNeT and C-CNeTs, while type B masks are applied in the subsequent layers, including the ResNet blocks. In the ResNet blocks, the middle convolution layer has $kernel=3$ while the two other layers have $kernel=1$.
  • Figure 4: Sparse mask construction and application
  • Figure 5: From left to right: the original Red&Black point cloud from MPEG 8i in the RGB color space, and the individual color components Y, chrominance orange Co, and chrominance green Cg.
  • ...and 6 more figures
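
The block partitioning in Figure 1 and the type A / type B masks in Figure 3 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `partition_into_blocks` and `causal_mask_3d` are hypothetical, the raster-scan axis order (x slowest, z fastest) is assumed, and the masks are dense PixelCNN-style masks, whereas the paper constructs sparse masks (Figure 4).

```python
import numpy as np

BLOCK_BITS = 6  # 2**6 = 64 voxels per block edge, matching the 64x64x64 blocks of Figure 1


def partition_into_blocks(coords):
    """Group integer voxel coordinates (shape (N, 3)) into 64x64x64 blocks.

    Returns a dict mapping a block index (bx, by, bz) to the local coordinates
    (each in [0, 63]) of the points in that block, sorted in an assumed
    raster-scan order: x slowest, z fastest.
    """
    coords = np.asarray(coords, dtype=np.int64)
    block_idx = coords >> BLOCK_BITS           # which block each point falls in
    local = coords & ((1 << BLOCK_BITS) - 1)   # position inside that block
    uniq, inverse = np.unique(block_idx, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)              # shape differs across NumPy versions
    blocks = {}
    for b, key in enumerate(uniq):
        pts = local[inverse == b]
        # np.lexsort treats the LAST key as the primary one: sort by x, then y, then z.
        order = np.lexsort((pts[:, 2], pts[:, 1], pts[:, 0]))
        blocks[tuple(int(v) for v in key)] = pts[order]
    return blocks


def causal_mask_3d(kernel_size, mask_type):
    """Dense causal mask for a cubic convolution kernel (an assumed simplification).

    Kernel positions that precede the center in raster-scan order are set to 1.
    Type "A" also hides the center (the value being predicted); type "B" keeps it.
    """
    if kernel_size % 2 != 1:
        raise ValueError("kernel_size must be odd")
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    n = kernel_size ** 3
    center = n // 2
    flat = np.zeros(n, dtype=np.float32)
    flat[:center] = 1.0            # already-coded neighbours
    if mask_type == "B":
        flat[center] = 1.0         # later layers may see the center position
    return flat.reshape(kernel_size, kernel_size, kernel_size)
```

Multiplying a convolution kernel element-wise by such a mask before applying it keeps each prediction dependent only on positions that the decoder has already reconstructed, which is the property the causal context in Figure 2 relies on.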