Rendering-Oriented 3D Point Cloud Attribute Compression using Sparse Tensor-based Transformer

Xiao Huo, Junhui Hou, Shuai Wan, Fuzheng Yang

TL;DR

Rendering-oriented PCAC (RO-PCAC) is an end-to-end deep learning framework that integrates point cloud attribute compression (PCAC) with differentiable rendering. It directly targets the quality of the rendered multiview images and achieves state-of-the-art compression performance.

Abstract

The evolution of 3D visualization techniques has fundamentally transformed how we interact with digital content. At the forefront of this change is point cloud technology, offering an immersive experience that surpasses traditional 2D representations. However, the massive data size of point clouds presents significant challenges in data compression. Current methods for lossy point cloud attribute compression (PCAC) generally focus on reconstructing the original point clouds with minimal error. However, for point cloud visualization scenarios, the reconstructed point clouds with distortion still need to undergo a complex rendering process, which affects the final user-perceived quality. In this paper, we propose an end-to-end deep learning framework that seamlessly integrates PCAC with differentiable rendering, denoted as rendering-oriented PCAC (RO-PCAC), directly targeting the quality of rendered multiview images for viewing. In a differentiable manner, the impact of the rendering process on the reconstructed point clouds is taken into account. Moreover, we characterize point clouds as sparse tensors and propose a sparse tensor-based transformer, called SP-Trans. By aligning with the local density of the point cloud and utilizing an enhanced local attention mechanism, SP-Trans captures the intricate relationships within the point cloud, further improving feature analysis and synthesis within the framework. Extensive experiments demonstrate that the proposed RO-PCAC achieves state-of-the-art compression performance, compared to existing reconstruction-oriented methods, including traditional, learning-based, and hybrid methods.

Paper Structure

This paper contains 28 sections, 10 equations, 14 figures, 9 tables.

Figures (14)

  • Figure 1: Visual comparison of reconstructed point clouds of Loot at similar bitrates. (b) is the proposed RO-PCAC without the rendering module, trained with a reconstruction loss. (c) is the proposed fully rendering-oriented RO-PCAC.
  • Figure 2: Overview of the Rendering-Oriented Point Cloud Attribute Compression framework. The framework comprises a compression module for feature analysis/synthesis, quantization (Q), and entropy encoding/decoding, and a rendering module for transformation, rasterization, and compositing of the point cloud data into multiview images. The rate-distortion (R-D) loss includes the estimated bitrate and the image error between the original and rendered multiview images.
  • Figure 3: The feature analysis and synthesis of the compression module. Except for the last convolution, each convolution is followed by a Rectified Linear Unit (ReLU) layer.
  • Figure 4: Two methods for searching nearest neighbors.
  • Figure 5: The architecture of SP-Trans.
  • ...and 9 more figures
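
The Figure 2 caption says the R-D loss combines an estimated bitrate with the image error between original and rendered multiview images. The page does not give the exact formula, so the following is only a minimal sketch under stated assumptions: the rate is normalized to bits per point, the image error is a mean squared error averaged over views, and the two terms are combined with a weight `lam`. The names `rd_loss`, `mse`, and `lam` are hypothetical and do not come from the paper.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two flattened images of equal size."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rd_loss(
    estimated_bits: float,
    num_points: int,
    original_views: Sequence[Sequence[float]],
    rendered_views: Sequence[Sequence[float]],
    lam: float,
) -> float:
    """Rate-distortion loss sketch: rate + lam * distortion.

    Assumptions (not confirmed by the source):
    - rate is the estimated bitstream size in bits per point;
    - distortion is the MSE between each original view and the matching
      rendered view, averaged over all views;
    - lam trades off rate against distortion.
    """
    if len(original_views) != len(rendered_views):
        raise ValueError("view lists must have the same length")
    rate = estimated_bits / num_points
    distortion = sum(
        mse(o, r) for o, r in zip(original_views, rendered_views)
    ) / len(original_views)
    return rate + lam * distortion


# Two tiny "views" of two pixels each, flattened to 1D.
original = [[0.0, 1.0], [1.0, 1.0]]
rendered = [[0.0, 0.0], [1.0, 1.0]]
loss = rd_loss(1000.0, 500, original, rendered, lam=2.0)
```

In an actual training setup the rendered views would come from a differentiable renderer so that gradients of the image error can flow back to the compression module. Here the views are plain lists, only to show how the two terms combine.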