Predicting Encoding Energy from Low-Pass Anchors for Green Video Streaming

Zoha Azimi, Reza Farahani, Vignesh V Menon, Christian Timmerer

TL;DR

The paper tackles the energy cost of high-resolution video encoding in adaptive streaming by introducing a lightweight, anchor-based energy prediction approach. It uses low-resolution anchors to infer encoding/decoding energy and perceptual quality across a full resolution/QP ladder via ML models, enabling an energy-aware encoding selection under a quality constraint. On 100 Inter4K sequences with the VVenC/VVdeC stack, the approach achieves about 51% encoding and 54% decoding energy savings with only a 1.68-point drop in VMAF (below the JND threshold), demonstrating a practical path to greener video streaming. The method reduces measurement overhead, scales to large content catalogs, and supports tunable energy-QoE trade-offs for real-world HAS deployments.

Abstract

Video streaming now represents the dominant share of Internet traffic, as ever-higher-resolution content is distributed across a growing range of heterogeneous devices to sustain user Quality of Experience (QoE). However, this trend raises significant concerns about energy efficiency and carbon emissions, requiring methods that trade off energy against QoE. This paper proposes a lightweight energy prediction method that estimates the energy consumption of high-resolution video encodings using reference encodings generated at lower resolutions (so-called anchors), eliminating the need for exhaustive per-segment energy measurements, a process that is infeasible at scale. We automatically select encoding parameters, such as resolution and quantization parameter (QP), to achieve substantial energy savings while maintaining perceptual quality, as measured by Video Multimethod Assessment Fusion (VMAF), within acceptable limits. We implement and evaluate our approach with the open-source VVenC encoder on 100 video sequences from the Inter4K dataset across multiple encoding settings. Results show that, for an average VMAF score reduction of only 1.68, which stays below the Just Noticeable Difference (JND) threshold, our method achieves 51.22% encoding energy savings and 53.54% decoding energy savings compared to a scenario with no quality degradation.
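
The selection step described in the abstract (pick a resolution/QP pair that saves energy while keeping VMAF within an acceptable limit) can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Candidate` type, the `select_encoding` function, the `max_vmaf_drop` parameter, and all numbers below are hypothetical, and the predicted energy and VMAF values are hard-coded placeholders standing in for whatever the paper's anchor-based ML models would output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One (resolution, QP) option with predicted energy and quality.

    All fields are illustrative; in the paper these predictions would come
    from models driven by low-resolution anchor encodings.
    """
    resolution: int      # vertical resolution in pixels, e.g. 2160
    qp: int              # quantization parameter
    pred_energy: float   # predicted encoding energy (arbitrary units)
    pred_vmaf: float     # predicted VMAF score (0-100)


def select_encoding(candidates, max_vmaf_drop):
    """Return the lowest-energy candidate whose predicted VMAF is within
    max_vmaf_drop of the best predicted VMAF in the candidate set."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    best_vmaf = max(c.pred_vmaf for c in candidates)
    feasible = [c for c in candidates
                if best_vmaf - c.pred_vmaf <= max_vmaf_drop]
    # feasible is never empty: the best-VMAF candidate always qualifies.
    return min(feasible, key=lambda c: c.pred_energy)


# Made-up ladder, for illustration only (not measurements from the paper).
ladder = [
    Candidate(2160, 22, 100.0, 96.0),
    Candidate(1440, 22, 55.0, 94.5),
    Candidate(1080, 27, 30.0, 91.0),
    Candidate(720, 32, 12.0, 80.0),
]

choice = select_encoding(ladder, max_vmaf_drop=2.0)
print(choice.resolution, choice.qp, choice.pred_energy)
```

Setting `max_vmaf_drop` to 0 reduces this to the no-quality-degradation baseline, while larger values trade more quality for energy, which is one simple way to expose the tunable energy-QoE trade-off the summary mentions.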

Paper Structure

This paper contains 17 sections, 1 equation, 8 figures, 4 tables, 1 algorithm.

Figures (8)

  • Figure 1: The correlation between encoding time and encoding energy for 100 video sequences, encoded at 720p/30fps and 2160p/60fps with three different thread counts.
  • Figure 2: Average correlation of encoding times across 100 video sequences for different resolutions and QPs. Each point shows the mean correlation of one configuration with all others, plotted against its average encoding time (log scale).
  • Figure 3: Proposed system architecture.
  • Figure 4: SOM-based clustering of the video complexity features on (a) the full dataset and (b) our 100-sequence subset.
  • Figure 5: Impact of resolution variations on encoding and decoding energy, bitrate, PSNR, and VMAF across 100 video sequences.
  • ...and 3 more figures