Adaptive Rate Control for Deep Video Compression with Rate-Distortion Prediction
Bowen Gu, Hao Chen, Ming Lu, Jie Yao, Zhan Ma
TL;DR
This work introduces a one-pass, neural network–based $\lambda$-domain rate control scheme for deep video compression that learns per-frame $R$-$\lambda$ and $D$-$\lambda$ relationships directly from uncompressed frames, without pre-encoding. A shared prediction module outputs per-frame R-D-$\lambda$ samples, which are fitted to derive per-frame curves that guide a mini-GOP-level rate control optimization; a distortion addition mechanism bridges reference-frame mismatch, and inputs are down-sampled for efficiency. The proposed method achieves high rate control accuracy with moderate time overhead and substantially reduces inter-frame quality fluctuations across resolutions and codecs. Experiments on diverse datasets and two deep codecs show favorable performance against multi-pass and one-pass baselines, highlighting practical gains for real-time and bandwidth-fluctuating streaming scenarios.
Abstract
Deep video compression has made significant progress in recent years, achieving rate-distortion performance that surpasses that of traditional video compression methods. However, rate control schemes tailored for deep video compression have not been well studied. In this paper, we propose a neural network-based $\lambda$-domain rate control scheme for deep video compression, which determines the coding parameter $\lambda$ for each to-be-coded frame based on the rate-distortion-$\lambda$ (R-D-$\lambda$) relationships directly learned from uncompressed frames, achieving high rate control accuracy efficiently without the need for pre-encoding. Moreover, this content-aware scheme is able to mitigate inter-frame quality fluctuations and adapt to abrupt changes in video content. Specifically, we introduce two neural network-based predictors to estimate the relationship between bitrate and $\lambda$, as well as the relationship between distortion and $\lambda$ for each frame. Then we determine the coding parameter $\lambda$ for each frame to achieve the target bitrate. Experimental results demonstrate that our approach achieves high rate control accuracy at the mini-GOP level with low time overhead and mitigates inter-frame quality fluctuations across video content of varying resolutions.
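
To make the final step concrete, the following is a minimal sketch of how a per-frame $\lambda$ could be chosen once a few predicted (rate, $\lambda$) samples are available. It is an illustration under stated assumptions, not the paper's implementation: the power-law form $R = a\,\lambda^{b}$ is borrowed from classical $\lambda$-domain rate control and is assumed here, the function names `fit_power_law` and `lambda_for_target_rate` are hypothetical, and the sample values are invented purely for demonstration. The mini-GOP-level optimization and the distortion model are not covered.

```python
import math


def fit_power_law(lams, values):
    """Least-squares fit of values ~ a * lam**b, done in the log-log domain.

    ASSUMPTION: a power-law curve is used for illustration only; the paper's
    actual curve family may differ.
    """
    xs = [math.log(lam) for lam in lams]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in the log-log domain
    a = math.exp(mean_y - b * mean_x)   # intercept mapped back from log space
    return a, b


def lambda_for_target_rate(a, b, target_rate, lam_min, lam_max):
    """Invert R = a * lam**b for lam, then clamp to the codec's valid range."""
    lam = (target_rate / a) ** (1.0 / b)
    return min(max(lam, lam_min), lam_max)


# Hypothetical predictor output for one frame: rates (bits per pixel) at four
# lambda values. These numbers are made up for the example.
sample_lams = [64.0, 128.0, 256.0, 512.0]
sample_rates = [0.030, 0.046, 0.070, 0.105]

a, b = fit_power_law(sample_lams, sample_rates)
chosen_lam = lambda_for_target_rate(a, b, target_rate=0.06,
                                    lam_min=1.0, lam_max=4096.0)
print(f"fitted a={a:.5f}, b={b:.3f}, chosen lambda={chosen_lam:.1f}")
```

The sign of the fitted exponent `b` follows the data, so the same inversion works whether a larger $\lambda$ raises or lowers the rate in a given codec's loss convention. A $D$-$\lambda$ curve could be fitted with the same helper if distortion is also assumed to follow a power law.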
