Content-Driven Frame-Level Bit Prediction for Rate Control in Versatile Video Coding

Amritha Premkumar; Prajit T Rajendran; Vignesh V Menon; Christian Herglotz

TL;DR

The paper tackles rate control in VVC by replacing traditional analytic rate–QP models with a content-aware frame-level bit predictor, implemented as a Random Forest regression over features from the Video Complexity Analyzer (VCA). It introduces frame-type-specific models (I, P, B) that use lightweight, multi-scale spatial-temporal features to predict per-frame bit consumption in the first pass, so that the second pass can refine QP using VVenC's R–QP mapping. Empirical results on UHD sequences show strong predictive accuracy ($R^2$ up to 0.93 for I-frames) and competitive BD$_{YUV}$ performance ($-0.14\%$ on average) with a 33.3% reduction in total encoding time, highlighting improved stability and efficiency. The approach offers a practical path to real-time, energy-efficient encoding in production pipelines and adaptive streaming, without requiring trial encodes. The $R^2$ values and BD$_{YUV}$ results indicate that content-driven, lightweight features effectively capture the complexity that drives bitrate in modern encoders.

Abstract

Rate control allocates bits efficiently across frames to meet a target bitrate while maintaining quality. Conventional two-pass rate control (2pRC) in Versatile Video Coding (VVC) relies on analytical rate-QP models, which often fail to capture nonlinear spatial-temporal variations, causing quality instability and high complexity due to multiple trial encodes. This paper proposes a content-adaptive framework that predicts frame-level bit consumption using lightweight features from the Video Complexity Analyzer (VCA) and quantization parameters within a Random Forest regression. On ultra-high-definition sequences encoded with VVenC, the model achieves strong correlation with ground truth, yielding $R^2$ values of 0.93, 0.88, and 0.77 for I-, P-, and B-frames, respectively. Integrated into a rate-control loop, it achieves comparable coding efficiency to 2pRC while reducing total encoding time by 33.3%. The results show that VCA-driven bit prediction provides a computationally efficient and accurate alternative to conventional rate-QP models.
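The per-frame-type $R^2$ figures quoted above can be read as the coefficient of determination computed separately for each frame type. The following is a minimal sketch of that computation in plain Python. The function names, the record layout, and the sample numbers are illustrative assumptions, not taken from the paper or from VVenC.

```python
from collections import defaultdict


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes `actual` is non-constant, so SS_tot is non-zero.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def r2_by_frame_type(records):
    """Group (frame_type, actual_bits, predicted_bits) records and score each group."""
    grouped = defaultdict(lambda: ([], []))
    for frame_type, actual, predicted in records:
        grouped[frame_type][0].append(actual)
        grouped[frame_type][1].append(predicted)
    return {ft: r2_score(a, p) for ft, (a, p) in grouped.items()}


# Made-up bit counts for illustration only (hypothetical, not results from the paper).
records = [
    ("I", 1000, 1100), ("I", 2000, 1900), ("I", 3000, 3000),
    ("B", 100, 150), ("B", 200, 150), ("B", 300, 300),
]
scores = r2_by_frame_type(records)
for frame_type, score in sorted(scores.items()):
    print(f"{frame_type}-frames: R^2 = {score:.2f}")
```

Scoring each frame type separately mirrors the paper's use of separate I-, P-, and B-frame models. A single pooled score would be dominated by whichever frame type has the largest spread in bit counts.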

Paper Structure (24 sections, 5 equations, 3 figures, 2 tables)

Introduction
Proposed Two-Pass Rate Control Framework
Video Complexity Feature Extraction
Regression Models
I-frame
P-frame
B-frame
Second Pass
Evaluation Setup
First Pass
Dataset
Feature Extraction
Learning Framework
Performance Metrics
Second Pass
...and 9 more sections

Figures (3)

Figure 1: Overview of the proposed two-pass rate control framework integrating VCA-driven bit prediction into VVenC. The first pass extracts content features, predicts frame-level bits, and refines QP assignment for the second-pass encoding.
Figure 2: Hierarchical GOP-32 structure showing inter-frame prediction dependencies (frame 0: I-frame; frame 32: P-frame; all others: B-frames).
Figure 3: Relative importance of features in bit prediction.
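
The frame-type assignment described in the Figure 2 caption can be sketched as a small lookup, which is how a frame would be routed to its frame-type-specific model. This is a hedged illustration: the function name `frame_type` is hypothetical, and the assumption that the pattern repeats every 32 frames with no further intra refresh goes beyond what the caption states.

```python
GOP_SIZE = 32  # hierarchical GOP length from the Figure 2 caption


def frame_type(index, gop_size=GOP_SIZE):
    """Return 'I', 'P', or 'B' for a frame index in display order.

    Assumption (not from the paper): only frame 0 is intra-coded, and every
    later multiple of `gop_size` is a P-frame anchoring the next GOP.
    """
    if index < 0:
        raise ValueError("frame index must be non-negative")
    if index == 0:
        return "I"
    if index % gop_size == 0:
        return "P"
    return "B"


# Frame types for the first GOP plus its closing anchor (frames 0..32).
types = [frame_type(i) for i in range(GOP_SIZE + 1)]
print(types[0], types[16], types[32])
print({t: types.count(t) for t in "IPB"})
```

Under this assumption, 31 of every 32 frames are B-frames, so the B-frame model handles most predictions even though B-frames have the lowest reported $R^2$.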
