Predicting Encoding Energy from Low-Pass Anchors for Green Video Streaming
Zoha Azimi, Reza Farahani, Vignesh V Menon, Christian Timmerer
TL;DR
The paper tackles the energy cost of high-resolution video encoding in adaptive streaming with a lightweight, anchor-based energy prediction approach. ML models take low-resolution anchor encodings as input and infer encoding energy, decoding energy, and perceptual quality across the full resolution/QP ladder, which enables energy-aware encoding selection under a quality constraint. On 100 Inter4K sequences with the VVenC/VVdeC stack, the approach achieves about 51% encoding and 54% decoding energy savings with an average VMAF drop of only 1.68 points, which stays below the Just Noticeable Difference (JND) threshold, demonstrating a practical path to greener video streaming. The method reduces measurement overhead, scales to large content catalogs, and supports tunable energy-QoE trade-offs for real-world HTTP Adaptive Streaming (HAS) deployments.
Abstract
Video streaming now represents the dominant share of Internet traffic, as ever-higher-resolution content is distributed across a growing range of heterogeneous devices to sustain user Quality of Experience (QoE). However, this trend raises significant concerns about energy efficiency and carbon emissions, calling for methods that balance energy consumption against QoE. This paper proposes a lightweight energy prediction method that estimates the energy consumption of high-resolution video encodings using reference encodings generated at lower resolutions (so-called anchors). This eliminates the need for exhaustive per-segment energy measurements, a process that is infeasible at scale. We automatically select encoding parameters, such as resolution and quantization parameter (QP), to achieve substantial energy savings while keeping perceptual quality, as measured by Video Multimethod Assessment Fusion (VMAF), within acceptable limits. We implement and evaluate our approach with the open-source VVenC encoder on 100 video sequences from the Inter4K dataset across multiple encoding settings. Results show that, for an average VMAF score reduction of only 1.68, which stays below the Just Noticeable Difference (JND) threshold, our method achieves 51.22% encoding energy savings and 53.54% decoding energy savings compared to a scenario with no quality degradation.
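The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` type, the `select_encoding` and `energy_saving_percent` helpers, the `max_vmaf_drop` parameter, and all numbers in the example ladder are hypothetical, and the sketch assumes that the "no quality degradation" reference is the candidate with the highest predicted VMAF. Given predicted energy and VMAF for each resolution/QP pair, it picks the lowest-energy candidate whose VMAF stays within the allowed drop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One (resolution, QP) point with predicted values (all hypothetical)."""
    height: int       # resolution height in pixels
    qp: int           # quantization parameter
    energy_j: float   # predicted encoding energy in joules
    vmaf: float       # predicted VMAF score


def select_encoding(candidates, max_vmaf_drop):
    """Return the lowest-energy candidate within max_vmaf_drop of the best VMAF."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    # Assumption: the highest-VMAF candidate is the no-degradation reference.
    reference = max(candidates, key=lambda c: c.vmaf)
    admissible = [
        c for c in candidates if reference.vmaf - c.vmaf <= max_vmaf_drop
    ]
    return min(admissible, key=lambda c: c.energy_j)


def energy_saving_percent(reference_energy, chosen_energy):
    """Relative energy saving of the chosen encoding versus the reference."""
    return 100.0 * (reference_energy - chosen_energy) / reference_energy


# Made-up ladder for illustration only; these are not values from the paper.
ladder = [
    Candidate(height=2160, qp=22, energy_j=1000.0, vmaf=96.0),
    Candidate(height=2160, qp=27, energy_j=700.0, vmaf=94.5),
    Candidate(height=1080, qp=22, energy_j=450.0, vmaf=93.0),
    Candidate(height=1080, qp=32, energy_j=300.0, vmaf=85.0),
]

chosen = select_encoding(ladder, max_vmaf_drop=3.0)
saving = energy_saving_percent(ladder[0].energy_j, chosen.energy_j)
print(chosen, f"{saving:.1f}% energy saved")
```

Raising or lowering `max_vmaf_drop` is one way to express the tunable energy-QoE trade-off: a tighter bound keeps the selection closer to the reference quality, while a looser bound admits cheaper encodings.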
