Progressive Frame Patching for FoV-based Point Cloud Video Streaming

Tongyu Zong, Yixiang Mao, Chen Li, Yong Liu, Yao Wang

TL;DR

This work tackles the high bandwidth and latency challenges of volumetric point cloud video streaming for 6-DoF viewing. It introduces a sliding-window, multi-round progressive patching framework that exploits octree-based spatial scalability and tile-based FoV adaptation, paired with a view-distance aware tile utility model and an analytical KKT-based water-filling rate allocation. The approach demonstrates robustness to FoV and bandwidth prediction errors and outperforms non-progressive and heuristic baselines on real PCV datasets and traces. The resulting method offers smoother, higher-quality view rendering in dynamic network and viewing conditions, with practical implications for scalable FoV-adaptive PCV streaming.

Abstract

Many XR applications require the delivery of volumetric video to users with six degrees of freedom (6-DoF) movements. The point cloud has become a popular volumetric video format. A dense point cloud consumes much higher bandwidth than a 2D or 360-degree video frame, and the user's Field of View (FoV) is more dynamic with 6-DoF movement than with 3-DoF movement. To save bandwidth, FoV-adaptive streaming predicts a user's FoV and only downloads point cloud data falling in the predicted FoV. However, it is vulnerable to FoV prediction errors, which can be significant when a long buffer is used for smooth streaming. In this work, we propose a multi-round progressive refinement framework for point cloud video streaming. Instead of sequentially downloading point cloud frames, our solution simultaneously downloads/patches multiple frames falling into a sliding time window, leveraging the inherent scalability of octree-based point cloud coding. The optimal rate allocation among all tiles of active frames is solved analytically using the heterogeneous tile rate-quality functions calibrated by the predicted user FoV. Multi-frame downloading/patching simultaneously takes advantage of the streaming smoothness resulting from a long buffer and the FoV prediction accuracy at short buffer lengths. We evaluate our streaming solution using simulations driven by real point cloud videos, real bandwidth traces, and 6-DoF FoV traces of real users. Our solution is robust against bandwidth and FoV prediction errors, and can deliver high and smooth view quality in the face of bandwidth variations and dynamic user and point cloud movements.
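
The abstract states that the rate allocation among tiles is solved analytically, and the TL;DR describes it as a KKT-based water-filling. As a rough illustration of that idea only, the Python sketch below allocates a bandwidth budget across tiles under an assumed logarithmic utility, w_i * log(1 + r_i). This utility, the function name water_fill, and the weights (standing in for FoV-dependent tile importance) are hypothetical choices for the example and are not the paper's calibrated rate-quality functions.

```python
def water_fill(weights, budget):
    """Maximize sum_i w_i * log(1 + r_i) subject to sum_i r_i = budget, r_i >= 0.

    The log utility is an assumption made for this sketch only.
    """
    if budget <= 0 or not any(w > 0 for w in weights):
        return [0.0] * len(weights)

    # KKT stationarity: w_i / (1 + r_i) = lam for every tile with r_i > 0,
    # so r_i = max(0, w_i / lam - 1). Tiles with w_i <= lam get no rate.
    def total_rate(lam):
        return sum(max(0.0, w / lam - 1.0) for w in weights)

    # total_rate decreases in lam, so bisect for the multiplier ("water level")
    # at which the allocated rate meets the budget.
    lo, hi = 1e-12, max(weights)  # total_rate(hi) == 0 <= budget
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if total_rate(mid) > budget:
            lo = mid
        else:
            hi = mid
    return [max(0.0, w / hi - 1.0) for w in weights]


if __name__ == "__main__":
    # Two tiles, the first three times as important as the second.
    print(water_fill([3.0, 1.0], 4.0))
```

With a small budget, low-weight tiles receive nothing and the whole budget goes to the most important tiles; as the budget grows, more tiles become active. This threshold behavior is what makes the allocation a water-filling.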

Paper Structure (28 sections, 19 equations, 13 figures, 6 tables, 2 algorithms)

Figures (13)

  • Figure 1: Octree-based Scalable and FoV-adaptive Point Cloud Coding
  • Figure 2: Progressive Streaming Example with Sliding-window Size of $3$: in round $i\Delta$, the base layers of all tiles of segment $i+3$ within the predicted FoV are downloaded for the first time, while tiles of segment $i+2$ falling into the predicted FoV are patched with enhancement layers. Tiles of segment $i+3$ will be patched in the next round $(i+1)\Delta$.
  • Figure 3: Tile Angular Resolution depends on the distance between the viewer and the tile, and on the suboctree's height within the tile.
  • Figure 4: Tile Utility Curve.
  • Figure 5: Comparison of KKT-const and non-progressive
  • ...and 8 more figures
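
The sliding-window schedule described in the Figure 2 caption can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function name round_plan and the action labels are hypothetical, the window in round i is assumed to cover segments i+1 through i+window (the caption only mentions segments i+2 and i+3 explicitly), and startup and FoV filtering of tiles are ignored.

```python
def round_plan(i, window=3):
    """Return {segment_index: action} for download round i.

    Assumption: the newest segment in the window (i + window) gets its base
    layer for the first time; older segments in the window are patched with
    enhancement layers.
    """
    plan = {}
    for seg in range(i + 1, i + window + 1):
        plan[seg] = "base" if seg == i + window else "patch"
    return plan


if __name__ == "__main__":
    for i in range(3):
        print(i, round_plan(i))
```

Under these assumptions, each segment is touched in `window` consecutive rounds: once for its base layer, when FoV prediction is least reliable, and then again as its playback time approaches and the prediction improves.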