GSStream: 3D Gaussian Splatting based Volumetric Scene Streaming System

Zhiye Tang, Qiudan Zhang, Lei Zhang, Junhui Hou, You Yang, Xu Wang

TL;DR

GSStream is a streaming system for 3DGS volumetric scenes that integrates two modules: a collaborative viewport prediction module, which predicts users' future behavior by learning collaborative priors from multiple users and historical priors from users' viewport sequences, and a deep reinforcement learning (DRL)-based bitrate adaptation module, which handles the variability of the state and action spaces in the bitrate adaptation problem. Together, they achieve efficient volumetric scene delivery.

Abstract

Recently, the 3D Gaussian splatting (3DGS) technique for real-time radiance field rendering has revolutionized the field of volumetric scene representation, providing users with an immersive experience. In return, however, it produces a large volume of data, which makes it extremely bandwidth-intensive. Researchers have proposed various approaches and 3DGS variants to obtain more compact scene representations, but real-time distribution remains challenging. In this paper, we propose GSStream, a novel volumetric scene streaming system that supports the 3DGS data format. Specifically, GSStream integrates two modules: a collaborative viewport prediction module, which predicts users' future behavior by learning collaborative priors from multiple users and historical priors from users' viewport sequences, and a deep reinforcement learning (DRL)-based bitrate adaptation module, which handles the variability of the state and action spaces in the bitrate adaptation problem. Together, they achieve efficient volumetric scene delivery. In addition, we build, for the first time, a user viewport trajectory dataset for volumetric scenes to support training and streaming simulation. Extensive experiments show that the proposed GSStream system outperforms existing representative volumetric scene streaming systems in visual quality and network usage. Demo video: https://youtu.be/3WEe8PN8yvA.

Paper Structure (33 sections, 17 equations, 11 figures, 2 tables, 1 algorithm)

Figures (11)

  • Figure 1: Visual comparison of different volumetric scene streaming systems. Our proposed GSStream achieves the best visual performance among the state-of-the-art (SOTA) systems with the help of the collaborative viewport prediction module and the DRL-based bitrate adaptation module. Note that these examples are rendered on the stump scene under a network constraint of 120 Mbps, at the 10th second (300th frame) of the viewing process.
  • Figure 2: Overview of the proposed GSStream system. ① In the pre-processing phase, the volumetric scene is divided into $K$ tiles, and each tile is downsampled into representations at $L$ quality levels. ② Representations are selectively transmitted to the client side to reconstruct a 3DGS scene for display. ③ The user's viewports captured by the HMD are used to predict future viewports. ④ The predicted future viewports are then used to generate the representation selection strategy for the next time slot.
  • Figure 3: Viewport trajectories captured from four of the subjects. Different subjects show different behavioral characteristics when viewing volumetric scenes. For example, subjects such as 07 and 16 tend to move around the scenes without a fixed pattern, while subjects such as 05 and 22 tend to stay in fixed positions with little movement.
  • Figure 4: Illustration of the CVP module. The network utilizes both historical information and collaborative information.
  • Figure 5: Architecture of the proposed DRL-based bitrate adaptation algorithm. The algorithm is based on DDPG, which follows the Actor-Critic framework [konda1999actor], with the actor head $\pi(\mathbf{s}_t)$ outputting an action $\mathbf{a}_t$, and the critic head outputting the corresponding action value $Q\left(\mathbf{s}_t,\mathbf{a}_t\right)$.
  • ...and 6 more figures
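
The pre-processing and selection steps described in the Figure 2 caption (tiling into $K$ tiles, $L$ quality levels per tile, per-slot representation selection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the uniform grid tiling, the random subsampling used to build quality levels, the cost measured in number of Gaussians, and the function names `tile_scene`, `build_levels`, and `select_levels` are all hypothetical. In particular, `select_levels` is a simple greedy stand-in and not the DRL-based bitrate adaptation module.

```python
import random
from collections import defaultdict


def tile_scene(points, grid=(2, 2, 2)):
    """Assign each Gaussian centre (x, y, z) to a cell of a uniform grid.

    Assumption: K = grid[0] * grid[1] * grid[2] axis-aligned tiles.
    Returns a dict mapping a cell index to the list of point indices in it.
    """
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    tiles = defaultdict(list)
    for idx, p in enumerate(points):
        cell = []
        for i in range(3):
            span = (maxs[i] - mins[i]) or 1.0  # avoid division by zero
            c = int((p[i] - mins[i]) / span * grid[i])
            cell.append(min(c, grid[i] - 1))  # clamp points on the max face
        tiles[tuple(cell)].append(idx)
    return dict(tiles)


def build_levels(tile_indices, keep_ratios=(0.25, 0.5, 1.0), seed=0):
    """Build L = len(keep_ratios) nested quality levels for one tile.

    Assumption: a level keeps a random fraction of the tile's Gaussians;
    lower levels are prefixes of higher ones, so upgrades are incremental.
    """
    rng = random.Random(seed)
    shuffled = list(tile_indices)
    rng.shuffle(shuffled)
    return [shuffled[:max(1, round(r * len(shuffled)))] for r in keep_ratios]


def select_levels(levels_per_tile, visible, budget):
    """Greedy stand-in for bitrate adaptation (NOT the DRL method).

    Visible tiles start at the lowest level (sent even if that alone
    exceeds the budget) and are upgraded round-robin while the total
    number of Gaussians stays within `budget`. Invisible tiles get None.
    """
    choice = {t: (0 if t in visible else None) for t in levels_per_tile}
    used = sum(len(levels_per_tile[t][0]) for t in visible)
    upgraded = True
    while upgraded:
        upgraded = False
        for t in sorted(visible):
            lv = choice[t]
            if lv + 1 < len(levels_per_tile[t]):
                extra = len(levels_per_tile[t][lv + 1]) - len(levels_per_tile[t][lv])
                if used + extra <= budget:
                    choice[t] = lv + 1
                    used += extra
                    upgraded = True
    return choice, used
```

In this sketch, the set `visible` would come from the predicted future viewport and `budget` from the estimated bandwidth for the next time slot; both are placeholders for the outputs of the prediction and adaptation modules.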