L-LBVC: Long-Term Motion Estimation and Prediction for Learned Bi-Directional Video Compression

Yongqi Zhai, Luyang Tang, Wei Jiang, Jiayu Yang, Ronggang Wang

TL;DR

Experiments show that the proposed L-LBVC significantly outperforms previous state-of-the-art LVC methods and even surpasses VVC (VTM) on some test datasets under the random-access configuration.

Abstract

Learned video compression (LVC) has recently shown superior performance under the low-delay configuration. However, learned bi-directional video compression (LBVC) still lags behind traditional bi-directional coding. The gap mainly arises from inaccurate long-term motion estimation and prediction of distant frames, especially in large-motion scenes. To address these two problems, this paper proposes a novel LBVC framework, L-LBVC. First, we propose an adaptive motion estimation module that handles both short-term and long-term motion: we directly estimate optical flow for adjacent frames and for non-adjacent frames with small motion, while for non-adjacent frames with large motion we recursively accumulate local flows between adjacent frames to estimate the long-term flow. Second, we propose an adaptive motion prediction module that largely reduces the bit cost of motion coding. To improve the accuracy of long-term motion prediction, we adaptively downsample reference frames at test time to match the motion ranges observed during training. Experiments show that L-LBVC significantly outperforms previous state-of-the-art LVC methods and even surpasses VVC (VTM) on some test datasets under the random-access configuration.
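
The accumulation of local flows mentioned in the abstract can be illustrated with classical flow composition. The sketch below is not the network from the paper; it assumes the textbook rule flow_ac(p) = flow_ab(p) + flow_bc(p + flow_ab(p)), a flow stored as flow[y][x] = (dx, dy), bilinear sampling, and border clamping. The names `sample_bilinear`, `compose`, and `accumulate` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed layout: flow[y][x] = (dx, dy), displacement in pixels.
Flow = List[List[Tuple[float, float]]]


def sample_bilinear(flow: Flow, x: float, y: float) -> Tuple[float, float]:
    """Bilinearly sample a flow field at (x, y), clamping to the border."""
    h, w = len(flow), len(flow[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0

    def lerp(a: float, b: float, t: float) -> float:
        return a + (b - a) * t

    out = []
    for c in (0, 1):  # c = 0 is dx, c = 1 is dy
        top = lerp(flow[y0][x0][c], flow[y0][x1][c], ax)
        bottom = lerp(flow[y1][x0][c], flow[y1][x1][c], ax)
        out.append(lerp(top, bottom, ay))
    return (out[0], out[1])


def compose(flow_ab: Flow, flow_bc: Flow) -> Flow:
    """Chain two flows: follow flow_ab, then read flow_bc at the landing point."""
    result: Flow = []
    for y, row_ab in enumerate(flow_ab):
        row = []
        for x, (dx, dy) in enumerate(row_ab):
            ndx, ndy = sample_bilinear(flow_bc, x + dx, y + dy)
            row.append((dx + ndx, dy + ndy))
        result.append(row)
    return result


def accumulate(local_flows: List[Flow]) -> Flow:
    """Fold a chain of adjacent-frame flows into one long-term flow."""
    total = local_flows[0]
    for next_flow in local_flows[1:]:
        total = compose(total, next_flow)
    return total


if __name__ == "__main__":
    # Three adjacent-frame flows, each a uniform shift of (1.0, 0.5) pixels.
    flows = [[[(1.0, 0.5)] * 4 for _ in range(4)] for _ in range(3)]
    print(accumulate(flows)[0][0])
```

With uniform local flows the accumulated displacement is simply their sum; with spatially varying flows, the resampling step in `compose` is what keeps each pixel's trajectory consistent across frames. Errors in the local flows also add up along the chain, which is one plausible reason to reserve accumulation for large-motion cases and estimate small motions directly, as the abstract describes.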

Paper Structure

This paper contains 15 sections, 6 equations, 6 figures, 9 tables.

Figures (6)

  • Figure 1: (a) Overview of the proposed L-LBVC framework. (b) Illustration of the optical flow codec. (c) Illustration of the B-frame codec.
  • Figure 2: Comparison of the estimated optical flows and temporal predictions of our L-LBVC with those of B-CANF [chen2023b] on the Jockey dataset.
  • Figure 3: Illustration of the proposed accumulation-based optical flow estimation method. (a) Example of the optical flow accumulation process. (b) The detailed network structure.
  • Figure 4: Visualization of the predicted optical flows using different downsampling factors.
  • Figure 5: RD-curves on the HEVC Class B, D and E datasets.
  • ...and 1 more figure