Real-Time Neural-Enhancement for Online Cloud Gaming

Shan Jiang, Zhenhua Han, Haisheng Tan, Xinyang Jiang, Yifan Yang, Xiaoxi Zhang, Hongqiu Ni, Yuqing Yang, Xiang-Yang Li

TL;DR

River tackles the real-time quality challenge of online cloud gaming by reusing previously fine-tuned SR models through content-aware retrieval, rather than fine-tuning a new model for every video segment. It introduces a content-aware encoder to build a compact lookup table of SR models, an online scheduler that either selects an existing model or triggers fine-tuning, and a prefetching strategy to balance video and model bandwidth. Empirical results show a 44% reduction in redundant training overhead and an average PSNR improvement of 1.81 dB over baselines, with end-to-end latency of about 50 ms on mobile devices and real-time operation at approximately 720p and 20 fps. This approach reduces wasted computation and bandwidth cost while delivering real-time neural-enhanced video for cloud gaming.

Abstract

Online cloud gaming demands real-time, high-quality video transmission across variable wide-area networks (WANs). Neural-enhanced video transmission algorithms that employ super-resolution (SR) for video quality enhancement have proven effective in challenging WAN environments. However, these SR-based methods require intensive fine-tuning on the whole video, which makes them infeasible for diverse online cloud gaming. To address this, we introduce River, a cloud gaming delivery framework designed around the observation that video segment features in cloud gaming are typically repetitive and redundant. This creates a significant opportunity to reuse fine-tuned SR models, replacing a fine-tuning latency of minutes with a query latency of milliseconds. To realize the idea, we design a practical system that addresses several challenges, such as model organization, online model scheduling, and transfer strategy. River first builds a content-aware encoder that fine-tunes SR models for diverse video segments and stores them in a lookup table. When delivering cloud gaming video streams online, River checks the video features and retrieves the most relevant SR models to enhance the frame quality. Meanwhile, if no existing SR model performs well enough for some video segments, River fine-tunes new models and updates the lookup table. Finally, to avoid the overhead of streaming model weights to the clients, River designs a prefetching strategy that predicts the models most likely to be retrieved. Our evaluation based on real video game streaming demonstrates that River can reduce redundant training overhead by 44% and improve the peak signal-to-noise ratio (PSNR) by 1.81 dB compared to state-of-the-art solutions. Practical deployment shows River meets real-time requirements, achieving approximately 720p at 20 fps on mobile devices.
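
The retrieve-or-fine-tune loop described in the abstract can be illustrated with a minimal sketch. This is not River's implementation: the names `ModelLookupTable`, `ModelEntry`, and `select_or_finetune`, the use of cosine similarity over segment features, and the `threshold` value are all assumptions made for illustration. In particular, the abstract says a new model is trained when no existing model "performs well enough"; the sketch substitutes a feature-similarity cutoff for that quality check.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class ModelEntry:
    """One lookup-table row: an SR model id keyed by a segment feature (hypothetical layout)."""
    model_id: str
    feature: Tuple[float, ...]


@dataclass
class ModelLookupTable:
    # Hypothetical cutoff: below this similarity we treat the lookup as a miss.
    threshold: float = 0.9
    entries: List[ModelEntry] = field(default_factory=list)

    def retrieve(self, feature: Sequence[float]) -> Optional[ModelEntry]:
        """Return the most similar stored entry, or None if nothing clears the threshold."""
        best: Optional[ModelEntry] = None
        best_sim = -1.0
        for entry in self.entries:
            sim = cosine_similarity(feature, entry.feature)
            if sim > best_sim:
                best, best_sim = entry, sim
        if best is not None and best_sim >= self.threshold:
            return best
        return None

    def select_or_finetune(
        self,
        feature: Sequence[float],
        fine_tune: Callable[[Sequence[float]], str],
    ) -> Tuple[str, bool]:
        """Reuse a stored model on a hit; otherwise fine-tune a new one and record it.

        Returns (model_id, reused).
        """
        hit = self.retrieve(feature)
        if hit is not None:
            return hit.model_id, True
        # Slow path: the caller-supplied fine_tune stands in for minutes of training.
        model_id = fine_tune(feature)
        self.entries.append(ModelEntry(model_id, tuple(feature)))
        return model_id, False
```

In this sketch a hit costs one linear scan over the stored features, while a miss pays for fine-tuning once and makes the new model available to later, similar segments, which is the reuse opportunity the abstract describes.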

Paper Structure

This paper contains 23 sections, 6 equations, 17 figures, 5 tables, 3 algorithms.

Figures (17)

  • Figure 1: Neural-enhanced Video Delivery Frameworks.
  • Figure 2: Performance degradation due to training delay.
  • Figure 3: River System Overview.
  • Figure 4: Comparison of PSNR among different SR models on different video segments.
  • Figure 5: Original frame.
  • ...and 12 more figures