Content Adaptive Encoding For Interactive Game Streaming
Shakarim Soltanayev, Odysseas Zisimopoulos, Mohammad Ashraful Anam, Man Cheung Kung, Angeliki Katsenou, Yiannis Andreopoulos
TL;DR
This work tackles the challenge of content-adaptive encoding for interactive game streaming by introducing CAE-IGS, a lightweight CNN-based approach that predicts the next scene's resolution from past per-line HEVC statistics. The method operates without buffering or encoder lookahead, adapting only at scene changes to keep latency at a minimum. By exploring rate-quality convex hulls and training zone-specific predictors, CAE-IGS achieves statistically significant quality gains and bitrate reductions over a static HEVC ladder, approaching the offline Optimal Ladder while maintaining practical, millisecond-scale CPU inference. The results demonstrate CAE-IGS as a feasible, zero-latency CAE solution for IGS with real-world applicability and room for further extension.
Abstract
Video-on-demand streaming has benefitted from content-adaptive encoding (CAE), i.e., adaptation of resolution and/or quantization parameters for each scene based on convex hull optimization. However, CAE is very challenging to develop and deploy for interactive game streaming (IGS): commercial IGS services impose ultra-low-latency encoding with no lookahead or buffering, and leave an extremely tight compute budget for executing any CAE algorithm. We propose the first CAE approach for resolution adaptation in IGS based on compact encoding metadata from past frames. Specifically, we train a convolutional neural network (CNN) to infer the best of the available resolutions for the upcoming scene from a running window of aggregated coding block statistics from the current scene. By deploying the trained CNN within a practical IGS setup based on HEVC encoding, our proposal: (i) improves over the default fixed-resolution ladder of HEVC by 2.3 Bjøntegaard Delta-VMAF points; (ii) requires 1 ms of inference on a single CPU core per scene, thereby adding no latency overhead.
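The control flow described in the abstract (aggregate statistics over a running window of past frames, then pick a resolution only when a scene change occurs) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the resolution options, the three per-frame features, the window length, and in particular `toy_predictor`, which is a hand-written threshold rule standing in for the trained CNN and is not the authors' model.

```python
from collections import deque
from typing import Callable, List, Sequence

# Hypothetical resolution options (frame heights); not taken from the paper.
RESOLUTIONS = (540, 720, 1080)


def toy_predictor(features: Sequence[float]) -> int:
    """Stand-in for the trained CNN (assumption, not the authors' model).

    features[0] is assumed to be the mean fraction of skipped coding blocks:
    mostly static content is given a higher resolution.
    """
    skip_fraction = features[0]
    if skip_fraction >= 0.7:
        return 1080
    if skip_fraction >= 0.3:
        return 720
    return 540


class ResolutionController:
    """Chooses a resolution per scene from past-frame statistics only."""

    def __init__(self, predictor: Callable[[Sequence[float]], int],
                 window: int = 30, initial: int = 1080) -> None:
        self.predictor = predictor
        self.window = deque(maxlen=window)  # running window of per-frame stats
        self.current = initial

    def push_stats(self, stats: Sequence[float]) -> None:
        """Record the statistics of the frame that was just encoded."""
        self.window.append(list(stats))

    def next_resolution(self, scene_change: bool) -> int:
        """Return the resolution for the next frame.

        The resolution changes only at a scene change, and only past frames
        are used, so no lookahead or frame buffering is required.
        """
        if scene_change and self.window:
            n = len(self.window)
            # Aggregate the window: per-feature mean over the stored frames.
            features: List[float] = [sum(col) / n for col in zip(*self.window)]
            self.current = self.predictor(features)
            self.window.clear()  # start collecting statistics for the new scene
        return self.current


# Usage: hypothetical per-frame stats are [skip_fraction, mean_qp, bits_per_pixel].
controller = ResolutionController(toy_predictor, window=3)
for frame_stats in ([0.1, 30.0, 0.05], [0.1, 31.0, 0.06], [0.1, 29.0, 0.05]):
    controller.next_resolution(scene_change=False)
    controller.push_stats(frame_stats)
chosen = controller.next_resolution(scene_change=True)
```

Clearing the window at each scene change is one possible design choice; a deployment could equally keep a sliding window across scenes. In a real setup the predictor would be replaced by the trained network and the features by the encoder's actual coding block statistics.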
