End-to-end learned Lossy Dynamic Point Cloud Attribute Compression
Dat Thanh Nguyen, Daniel Zieger, Marc Stamminger, Andre Kaup
TL;DR
This work tackles dynamic point cloud attribute compression with an end-to-end learned framework based on a variational autoencoder, which encodes attributes into a latent variable $f$ and uses a spatiotemporal auto-regressive context for entropy coding. The method jointly optimizes a rate-distortion objective and models the latent prior with auto-regressive and temporal dependencies, enabling efficient bitstream encoding. In experiments on the MPEG 8i and MVUB datasets, it reports BD-rate savings of $38.1\%$ and BD-quality gains of about 1.44 dB at the same bitrate relative to the RAHT-based core module of MPEG G-PCC, while maintaining low encoding/decoding complexity. The results demonstrate strong potential for end-to-end learned dynamic attribute coding and point to two avenues for future work: extending the approach to multiple modalities and further reducing complexity.
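As a rough illustration of what jointly optimizing a rate-distortion objective can look like, the sketch below evaluates a loss of the common form $L = R + \lambda D$. The specifics are assumptions made for illustration, not taken from the paper: rate is estimated as the negative log2-likelihood of the latent symbols per input point, distortion is the mean squared error on the attributes, and the names `rd_loss` and `lam` are hypothetical.

```python
import math


def rate_bits_per_point(likelihoods, num_points):
    """Estimated rate: sum of -log2(p) over latent symbols, per input point.

    `likelihoods` stands in for the probabilities an entropy model assigns
    to each quantized latent symbol (assumed to be given here).
    """
    total_bits = -sum(math.log2(p) for p in likelihoods)
    return total_bits / num_points


def mse(original, reconstructed):
    """Mean squared error over all attribute channels of all points."""
    squared = [
        (a - b) ** 2
        for p_orig, p_rec in zip(original, reconstructed)
        for a, b in zip(p_orig, p_rec)
    ]
    return sum(squared) / len(squared)


def rd_loss(likelihoods, original, reconstructed, lam):
    """Rate-distortion loss L = R + lam * D (assumed form, lam is a trade-off weight)."""
    rate = rate_bits_per_point(likelihoods, len(original))
    distortion = mse(original, reconstructed)
    return rate + lam * distortion
```

A larger `lam` penalizes distortion more heavily and therefore favors higher-bitrate, higher-quality operating points; training one model per `lam` value is a common way to trace out a rate-distortion curve.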
Abstract
Recent advances in point cloud compression have focused primarily on geometry, while comparatively little effort has been devoted to attribute compression. This study introduces an end-to-end learned lossy attribute coding approach for dynamic point clouds. It uses an efficient high-dimensional convolution to capture extensive inter-point dependencies, which enables the efficient projection of attribute features into latent variables. We then employ a context model that leverages the previous latent space, in conjunction with an auto-regressive context model, to encode the latent tensor into a bitstream. Evaluated on widely used point cloud datasets from MPEG and Microsoft, our method outperforms the Region-Adaptive Hierarchical Transform (RAHT), the core attribute compression module of MPEG Geometry-based Point Cloud Compression (G-PCC), with an average Bjontegaard Delta-rate saving of 38.1%, while keeping encoding and decoding complexity low.
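For readers unfamiliar with the Bjontegaard Delta-rate, the sketch below shows the idea behind the metric: it is the average difference in log-bitrate between two rate-quality curves over their shared quality range, expressed as a percentage. This is a simplified variant under stated assumptions. It uses piecewise-linear interpolation of log-rate over quality instead of the cubic polynomial fit of the original Bjontegaard method, it assumes quality is measured as PSNR in dB, and the function name `bd_rate` is hypothetical. It is not the evaluation code used in the paper.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x outside interpolation range")


def _integrate(xs, ys, lo, hi):
    """Exact integral of the piecewise-linear curve over [lo, hi]."""
    knots = [lo] + [x for x in xs if lo < x < hi] + [hi]
    vals = [_interp(x, xs, ys) for x in knots]
    return sum(
        (knots[i + 1] - knots[i]) * (vals[i] + vals[i + 1]) / 2.0
        for i in range(len(knots) - 1)
    )


def bd_rate(ref_rates, ref_psnr, test_rates, test_psnr):
    """Average bitrate change (%) of the test codec vs. the reference.

    Negative values mean the test codec needs fewer bits for equal quality.
    """
    ref = sorted(zip(ref_psnr, ref_rates))
    test = sorted(zip(test_psnr, test_rates))
    ref_q = [q for q, _ in ref]
    ref_log = [math.log(r) for _, r in ref]
    test_q = [q for q, _ in test]
    test_log = [math.log(r) for _, r in test]

    # Only the quality range covered by both curves is compared.
    lo = max(ref_q[0], test_q[0])
    hi = min(ref_q[-1], test_q[-1])
    if hi <= lo:
        raise ValueError("curves do not overlap in quality")

    avg_diff = (
        _integrate(test_q, test_log, lo, hi) - _integrate(ref_q, ref_log, lo, hi)
    ) / (hi - lo)
    return (math.exp(avg_diff) - 1.0) * 100.0
```

Under this sign convention, a reported saving of 38.1% corresponds to a return value of about -38.1. As a sanity check with made-up numbers, a test codec that needs exactly half the bitrate of the reference at every quality level yields -50.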
