Adaptive Rate Control for Deep Video Compression with Rate-Distortion Prediction

Bowen Gu, Hao Chen, Ming Lu, Jie Yao, Zhan Ma

TL;DR

This work introduces a one-pass, neural network-based $\lambda$-domain rate control scheme for deep video compression that learns per-frame $R$-$\lambda$ and $D$-$\lambda$ relationships directly from uncompressed frames, without pre-encoding. A shared prediction module outputs per-frame R-D-$\lambda$ samples, which are fitted to per-frame curves that guide a rate-control optimization at the mini-GOP level; a distortion addition mechanism bridges reference-frame mismatch, and inputs are down-sampled for efficiency. The method achieves high rate-control accuracy with moderate time overhead and substantially reduces inter-frame quality fluctuations across resolutions and codecs. Experiments show favorable performance against multi-pass and one-pass baselines on diverse datasets and two deep codecs, highlighting practical gains for real-time or bandwidth-fluctuating streaming scenarios.

Abstract

Deep video compression has made significant progress in recent years, achieving rate-distortion performance that surpasses that of traditional video compression methods. However, rate control schemes tailored for deep video compression have not been well studied. In this paper, we propose a neural network-based $\lambda$-domain rate control scheme for deep video compression, which determines the coding parameter $\lambda$ for each to-be-coded frame based on the rate-distortion-$\lambda$ (R-D-$\lambda$) relationships directly learned from uncompressed frames, achieving high rate control accuracy efficiently without the need for pre-encoding. Moreover, this content-aware scheme is able to mitigate inter-frame quality fluctuations and adapt to abrupt changes in video content. Specifically, we introduce two neural network-based predictors to estimate the relationship between bitrate and $\lambda$, as well as the relationship between distortion and $\lambda$ for each frame. Then we determine the coding parameter $\lambda$ for each frame to achieve the target bitrate. Experimental results demonstrate that our approach achieves high rate control accuracy at the mini-GOP level with low time overhead and mitigates inter-frame quality fluctuations across video content of varying resolutions.
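
To make the "fit a per-frame curve, then pick $\lambda$ for a target bitrate" idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a power-law model $R \approx a\,\lambda^{b}$ (a form borrowed from classical $\lambda$-domain rate control; the model actually used in the paper is not stated in this summary), a single $\lambda$ shared by all frames of a mini-GOP, and exponents of the same sign for every frame. The function names `fit_power_law`, `predicted_rate`, and `solve_shared_lambda`, as well as all numeric values, are hypothetical.

```python
import numpy as np

def fit_power_law(lams, values):
    """Least-squares fit of values ~ a * lam**b, done as a line in log-log space."""
    b, log_a = np.polyfit(np.log(lams), np.log(values), 1)
    return float(np.exp(log_a)), float(b)

def predicted_rate(lam, curves):
    """Total predicted bpp of a mini-GOP if every frame is coded with the same lam."""
    return sum(a * lam ** b for a, b in curves)

def solve_shared_lambda(curves, target, lo=1e-3, hi=1e5, iters=100):
    """Bisection on log(lam). Assumes the total rate is monotone in lam,
    which holds when all exponents b share the same sign (an assumption)."""
    increasing = predicted_rate(hi, curves) > predicted_rate(lo, curves)
    for _ in range(iters):
        mid = np.sqrt(lo * hi)  # geometric midpoint of the bracket
        too_high = predicted_rate(mid, curves) > target
        if too_high == increasing:
            hi = mid  # the solution lies at smaller lam
        else:
            lo = mid  # the solution lies at larger lam
    return float(np.sqrt(lo * hi))

# Synthetic (lambda, bpp) samples standing in for the predictor's output.
sample_lams = np.array([64.0, 256.0, 1024.0, 4096.0])
frame_rates = [0.01 * sample_lams ** 0.5, 0.02 * sample_lams ** 0.5]
curves = [fit_power_law(sample_lams, r) for r in frame_rates]

# Total budget of 0.3 bpp over the two frames; the solution is close to 100 here.
lam_star = solve_shared_lambda(curves, target=0.3)
print(lam_star, predicted_rate(lam_star, curves))
```

The bisection checks the direction of monotonicity at the bracket endpoints, so the sketch works whether a larger $\lambda$ raises or lowers the bitrate in a given codec's convention. A per-frame $\lambda$ that also accounts for the predicted $D$-$\lambda$ curves, as the abstract describes, would need a joint optimization that this sketch does not attempt.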

Paper Structure

This paper contains 11 sections, 6 equations, 3 figures, 2 tables, 1 algorithm.

Figures (3)

  • Figure 1: The system framework of our rate control scheme.
  • Figure 2: Network structure of our prediction module.
  • Figure 3: Rate control results on HEVC class E FourPeople sequence. The target bpp is set as 0.075. "Baseline" denotes the multi-pass rate control method, and "Anchor" denotes the fixed $\lambda$ coding approach.