Subjective assessment of the impact of a content adaptive optimiser for compressing 4K HDR content with AV1

Vibhoothi, Angeliki Katsenou, François Pitié, Katarina Domijan, Anil Kokaram

TL;DR

This work investigates the subjective impact of per-clip content-adaptive λ optimization for AV1 on 4K HDR content, comparing perceptual scores with a range of objective metrics. It formulates a per-clip optimization via the RD objective $J = D + \lambda R$, where $\lambda \approx A \cdot q_{dc}^2$, and tunes clip-specific multipliers using Powell's method, reporting BD-rate improvements on HDR sequences. Through a DSCQS subjective study with 42 observers and seven 4K HDR clips, the study analyzes expert vs non-expert differences and correlates subjective scores with metrics like HDR-VDP-3 and VMAF, noting that film grain and ISO-noise influence judgments. Overall, the method yields modest perceptual gains (average MOS up by about $5.19\%$ with bitrate savings around $4.68\%$), with HDR-VDP-3 and VMAF providing the strongest correlations to subjective quality, and highlights the need for refined protocols to reduce variance in HDR subjective testing.
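As a minimal sketch of the rate-distortion trade-off described above, the snippet below picks, from a few candidate encoding choices, the one that minimises $J = D + \lambda R$ with $\lambda = k \cdot A \cdot q_{dc}^2$. The multiplier `k`, the constant `A = 3.0`, the function names, and the candidate (rate, distortion) values are all illustrative assumptions, not values taken from the paper or from any AV1 encoder. The per-clip search over `k` (Powell's method in the paper) is not reproduced here.

```python
def rd_lambda(q_dc, k=1.0, A=3.0):
    """Lagrange multiplier lambda = k * A * q_dc^2.

    A = 3.0 is a placeholder constant and k is a hypothetical
    per-clip multiplier; neither is an encoder default.
    """
    return k * A * q_dc ** 2


def best_candidate(candidates, lam):
    """Return the (rate, distortion) pair that minimises J = D + lam * R."""
    return min(candidates, key=lambda rd: rd[1] + lam * rd[0])


# Made-up (rate, distortion) options for a single coding decision.
candidates = [(100, 9000.0), (200, 5000.0), (400, 3500.0)]

for k in (0.1, 0.5, 2.0):
    lam = rd_lambda(q_dc=4, k=k)
    print(f"k={k}: lambda={lam:.1f}, choice={best_candidate(candidates, lam)}")
```

In this toy setting, a smaller `k` weights rate less and so favours the high-rate, low-distortion option, while a larger `k` pushes the choice toward the low-rate option. Tuning the multiplier per clip moves the encoder along this trade-off.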

Abstract

Since 2015, video dimensionality has expanded to higher spatial and temporal resolutions and a wider colour gamut. This High Dynamic Range (HDR) content has gained traction in the consumer space as it delivers an enhanced quality of experience. At the same time, the complexity of codecs is growing. This has driven the development of tools for content-adaptive optimisation that achieve optimal rate-distortion performance for HDR video at 4K resolution. While improvements of just a few percentage points in BD-Rate (1-5%) are significant for the streaming media industry, the impact on subjective quality has been less studied, especially for HDR/AV1. In this paper, we conduct a subjective quality assessment (42 subjects) of 4K HDR content with a per-clip optimisation strategy. We correlate these subjective scores with existing popular objective metrics used in standard development and show that some perceptual metrics correlate surprisingly well even though they are not tuned for HDR. We find that the DSCQS protocol is too insensitive to categorically compare the methods, but the data allows us to make recommendations about the use of experts vs non-experts in HDR studies, and to explain the subjective impact of film grain in HDR content under compression.
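The correlation between subjective scores and an objective metric can be illustrated with a Spearman rank correlation, the measure named in the Figure 3 caption. The sketch below is a plain-Python version under stated assumptions: the function names are hypothetical, and the score lists are invented for illustration and are not data from the study.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: subjective scores vs. an objective metric for five encodes.
subjective = [30.0, 45.0, 60.0, 72.0, 80.0]
objective = [55.0, 70.0, 68.0, 85.0, 93.0]
print(round(spearman(subjective, objective), 3))
```

Because Spearman correlation depends only on ranks, applying any strictly increasing mapping to the objective scores leaves the result unchanged.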

Paper Structure (13 sections, 3 figures, 1 table)


Figures (3)

  • Figure 1: Sample frames of the video dataset (Sequences are tone mapped to BT.709 for representation)
  • Figure 2: P910-DMOS score variation for all 42 subjects who participated in the study (first group of panels). The last three panels show, for the MeridianFace sequence, P910-DMOS scores for the group of experts (10), P910-DMOS scores for the group of non-experts (32), and VMAF, respectively.
  • Figure 3: Spearman correlation between the opinion scores from 42 subjects and the objective metrics, after non-linear mapping [2006subobjmapping], grouped by different cohorts and score recovery.