SSNVC: Single Stream Neural Video Compression with Implicit Temporal Information

Feng Wang; Haihang Ruan; Zhihuang Xie; Ronggang Wang; Xiangyu Yue

TL;DR

This paper proposes Single Stream Neural Video Compression (SSNVC), which implicitly uses temporal information to remove temporal redundancy in video sequences and greatly simplifies both the training and the compression process of NVC.

Abstract

Recently, Neural Video Compression (NVC) techniques have achieved remarkable performance, even surpassing the best traditional lossy video codecs. However, most existing NVC methods rely heavily on transmitting Motion Vectors (MV) to generate accurate contextual features, which has two drawbacks. (1) Compressing and transmitting MV requires a dedicated MV encoder and decoder, which adds redundant modules. (2) The MV encoder-decoder makes the training strategy complex. In this paper, we present a novel Single Stream NVC framework (SSNVC), which removes the complex MV encoder-decoder structure and uses a one-stage training strategy. SSNVC uses temporal information implicitly in two ways: it adds the previous entropy model feature to the current entropy model, and it uses the two previous frames to generate predicted motion information at the decoder side. In addition, we enhance the frame generator to produce higher-quality reconstructed frames. Experiments demonstrate that SSNVC achieves state-of-the-art performance on multiple benchmarks and greatly simplifies both the compression process and the training process.
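The idea of deriving motion at the decoder side from the two previous frames, so that no MV has to be transmitted, can be illustrated with a toy sketch. This is not the learned predictor described in the paper: it assumes a single global integer translation with constant velocity and cyclic borders, and the helper names (`roll2d`, `estimate_global_shift`, `predict_current`) are hypothetical, introduced only for illustration.

```python
import random


def roll2d(frame, dy, dx):
    """Cyclically shift a 2-D list so that content moves by (+dy, +dx)."""
    h, w = len(frame), len(frame[0])
    return [[frame[(y - dy) % h][(x - dx) % w] for x in range(w)]
            for y in range(h)]


def mse(a, b):
    """Mean squared error between two frames of equal size."""
    h, w = len(a), len(a[0])
    return sum((a[y][x] - b[y][x]) ** 2
               for y in range(h) for x in range(w)) / (h * w)


def estimate_global_shift(prev2, prev1, max_shift=3):
    """Exhaustively search the (dy, dx) that best maps prev2 onto prev1.

    Both inputs are already-decoded frames, so the decoder can run this
    search on its own and no motion bits are needed in the bitstream.
    """
    candidates = [(dy, dx)
                  for dy in range(-max_shift, max_shift + 1)
                  for dx in range(-max_shift, max_shift + 1)]
    return min(candidates, key=lambda s: mse(roll2d(prev2, *s), prev1))


def predict_current(prev2, prev1, max_shift=3):
    """Extrapolate the current frame, assuming the motion stays constant."""
    dy, dx = estimate_global_shift(prev2, prev1, max_shift)
    return roll2d(prev1, dy, dx), (dy, dx)


# Toy sequence: a random texture that moves by (1, 2) pixels per frame.
random.seed(0)
frame0 = [[random.random() for _ in range(16)] for _ in range(16)]
frame1 = roll2d(frame0, 1, 2)
frame2 = roll2d(frame1, 1, 2)  # the frame the codec is about to code

predicted, shift = predict_current(frame0, frame1)
```

In this toy setting the prediction matches the true current frame exactly, so only a residual of zero would remain to be coded. Real video has local, non-constant motion, which is why a learned predictor rather than a global search is needed in practice.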

Paper Structure (11 sections, 2 equations, 4 figures, 4 tables)

1 Introduction
2 Analysis of Neural Video Compression
3 Method
4 Experiments
5 Conclusion
References

Figures (4)

Figure 1: Training strategy of different neural video compression models. Each color block represents a training stage.
Figure 2: Overview of our proposed video compression scheme. The red solid lines are used only at the encoder side. The blue solid lines are used only at the decoder side.
Figure 3: Left: Structure of the Dense U-Net module. ⓒ stands for concatenation. Right: Visualization of the features before the last convolutional layer of the frame generator. Pictures are from HEVC Class C.
Figure 4: Rate-distortion performance of SSNVC on the HEVC Class C, D and E datasets.
