Bjøntegaard Delta (BD): A Tutorial Overview of the Metric, Evolution, Challenges, and Recommendations

Nabajeet Barman, Maria G. Martini, Yuriy Reznik

TL;DR

The paper examines the reliability and usage of the Bjøntegaard Delta (BD) metric, a standard tool for comparing the rate–distortion performance of video codecs. It provides a tutorial on BD theory, traces its 20-year evolution, and evaluates multiple implementations using PSNR, SSIM, VMAF, and subjective ratings (MOS). Through an experimental study on an open UHD dataset, it shows that the choice between cubic and piecewise-cubic interpolation, and the choice of distortion metric, can drastically affect reported BD results, especially when RD curves are not well-behaved. The authors offer practical recommendations to improve interpretation, such as using piecewise-cubic fits, reporting BD-Rate alongside BD-Quality, and considering alternative operating ranges or network-aware averaging. The work highlights the need for careful metric design and suggests extensions for learning-based and network-aware scenarios to better reflect real-world codec performance and operating conditions.

Abstract

The Bjøntegaard Delta (BD) method proposed in 2001 has become a popular tool for comparing video codec compression efficiency. It was initially proposed to compute bitrate and quality differences between two Rate-Distortion curves using PSNR as a distortion metric. Over the years, many works have calculated and reported BD results using other objective quality metrics such as SSIM, VMAF and, in some cases, even subjective ratings (mean opinion scores). However, the lack of consolidated literature explaining the metric, its evolution over the years, and a systematic evaluation of the same under different test conditions can result in a wrong interpretation of the BD results thus obtained. Towards this end, this paper presents a detailed tutorial describing the BD method and example cases where the metric might fail. We also provide a detailed history of its evolution, including a discussion of various proposed improvements and variations over the last 20 years. In addition, we evaluate the various BD methods and their open-source implementations, considering different objective quality metrics and subjective ratings taking into account different RD characteristics. Based on our results, we present a set of recommendations on using existing BD metrics and various insights for possible exploration towards developing more effective tools for codec compression efficiency evaluation and comparison.

Paper Structure (39 sections, 14 equations, 9 figures, 5 tables)

Figures (9)

  • Figure 1: Distortion-Rate performance comparison of two codecs, codec A and codec B. Red points show the measured (R,D) operating points for codec A. Blue points show the measured (R,D) operating points for codec B. The functions $D_A(R)$ and $D_B(R)$ show the results of interpolation across sample points and extrapolation beyond. These functions can be understood as approximations of Operational Rate-Distortion characteristics of codecs A and B, respectively.
  • Figure 2: Computation of BD-PSNR. $R_{min}$ and $R_{max}$ indicate the range of integration along bitrate, and the yellow region shows the area of the integral $A_D = \int_R [D_A(R)-D_B(R)]\, dR$. The average BD-PSNR value is then computed as $\bar{\Delta}_D = A_D/(R_{max}-R_{min})$ [dB].
  • Figure 3: Computation of BD-Rate. $D_{min}$ and $D_{max}$ indicate the range of integration along distortion (PSNR), and the yellow region shows the area of the integral $A_R = \int_D [R_A(D)-R_B(D)]\, dD$. The average BD-Rate value is then computed as $\bar{\Delta}_R = A_R/(D_{max}-D_{min})$.
  • Figure 4: Example case with crossing RD curves, where the result of the BD metric may be confusing.
  • Figure 5: Example cases with different degrees of overlap between the RD curves. (a) Well-behaved case with a good overlap between the RD curves. (b) Too "low" case with a small overlap area between the RD curves. (c) Too "high" case, where the overlap between the RD curves is larger than the range over which the Quality-Bitrate values are obtained for Codec A.
  • ...and 4 more figures
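
The averages described in the captions of Figures 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation or any of the open-source tools the paper evaluates. It rests on several assumptions: bitrate is handled as log10(R), as in the commonly used BD formulation; the difference is taken as codec A minus codec B, following the captions; the log-domain rate gap is converted to a percentage; and the integration range is the overlap of the two curves. The function names (`mean_gap`, `bd_psnr`, `bd_rate`) and the sample data are hypothetical. The `method` argument switches between a single cubic polynomial fit and a piecewise-cubic (PCHIP) interpolant, the two interpolation families contrasted in the TL;DR.

```python
import numpy as np
from numpy.polynomial import Polynomial
from scipy.interpolate import PchipInterpolator


def mean_gap(x_a, y_a, x_b, y_b, method="pchip"):
    """Average of y_a(x) - y_b(x) over the x range shared by both curves."""
    curves = []
    for x, y in ((x_a, y_a), (x_b, y_b)):
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        order = np.argsort(x)  # the interpolators need increasing x
        curves.append((x[order], y[order]))
    lo = max(c[0][0] for c in curves)   # start of the overlap
    hi = min(c[0][-1] for c in curves)  # end of the overlap
    if hi <= lo:
        raise ValueError("the two curves share no common range")

    def area(x, y):
        if method == "cubic":
            # Single third-order polynomial fitted to all points.
            antiderivative = Polynomial.fit(x, y, 3).integ()
            return antiderivative(hi) - antiderivative(lo)
        if method == "pchip":
            # Shape-preserving piecewise-cubic interpolation.
            return float(PchipInterpolator(x, y).integrate(lo, hi))
        raise ValueError(f"unknown method: {method}")

    (xa, ya), (xb, yb) = curves
    return (area(xa, ya) - area(xb, yb)) / (hi - lo)


def bd_psnr(rate_a, psnr_a, rate_b, psnr_b, method="pchip"):
    """Average PSNR gap (A minus B) in dB, integrated over log10 bitrate."""
    return mean_gap(np.log10(rate_a), psnr_a, np.log10(rate_b), psnr_b, method)


def bd_rate(rate_a, psnr_a, rate_b, psnr_b, method="pchip"):
    """Average bitrate gap of A relative to B in percent, integrated over PSNR."""
    avg_log_gap = mean_gap(psnr_a, np.log10(rate_a), psnr_b, np.log10(rate_b), method)
    return (10.0 ** avg_log_gap - 1.0) * 100.0


# Made-up operating points: codec A is exactly 1 dB better at every bitrate.
rates = [1000, 2000, 4000, 8000]  # kbps
psnr_b = [32.0, 35.0, 38.0, 41.0]
psnr_a = [33.0, 36.0, 39.0, 42.0]

for m in ("cubic", "pchip"):
    print(m, bd_psnr(rates, psnr_a, rates, psnr_b, m),
          bd_rate(rates, psnr_a, rates, psnr_b, m))
```

With this synthetic data, PSNR is linear in log bitrate, so both interpolation methods agree: BD-PSNR is 1.0 dB and BD-Rate is about -20.6 %. On measured curves that are not this well-behaved, the two methods can diverge, which is the situation the paper studies.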