Implicit Neural Representation for Videos Based on Residual Connection

Taiga Hayami, Hiroshi Watanabe

TL;DR

The paper tackles video compression with implicit neural representations and seeks to improve frame fidelity beyond existing INR methods. It introduces a residual learning approach that injects a low-resolution frame as a residual, with the reconstruction built from the upsampled residual plus the network output. Evaluations on the DAVIS dataset show the method outperforms HNeRV in PSNR for 46 of 49 videos and in MS-SSIM for 39 of 49, at a modest bitrate cost. This demonstrates the value of residual connections for refining high-frequency content in INR-based video compression and points to future work on optimal resize scale and learning-rate settings.
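
The reconstruction described above (upsampled low-resolution frame plus network output) can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation: the average-pool downsampling, the nearest-neighbour upsampling, the scale factor of 2, and the function names `downsample`, `upsample`, and `reconstruct` are all assumptions made for the example, and `network_output` stands in for whatever the trained decoder would predict.

```python
def downsample(frame, scale):
    """Average-pool a 2D frame (list of rows) by an integer factor (assumed scheme)."""
    h, w = len(frame), len(frame[0])
    return [
        [
            sum(frame[y * scale + dy][x * scale + dx]
                for dy in range(scale) for dx in range(scale)) / (scale * scale)
            for x in range(w // scale)
        ]
        for y in range(h // scale)
    ]


def upsample(frame, scale):
    """Nearest-neighbour upsampling by an integer factor (assumed scheme)."""
    return [
        [frame[y // scale][x // scale] for x in range(len(frame[0]) * scale)]
        for y in range(len(frame) * scale)
    ]


def reconstruct(low_res, network_output, scale):
    """Residual connection: upsampled low-resolution frame + network output."""
    base = upsample(low_res, scale)
    return [
        [b + d for b, d in zip(base_row, out_row)]
        for base_row, out_row in zip(base, network_output)
    ]


# Toy 4x4 grayscale frame; a real frame would be a full-resolution image.
frame = [[1, 3, 5, 7],
         [3, 1, 7, 5],
         [2, 4, 6, 8],
         [4, 2, 8, 6]]
low_res = downsample(frame, 2)

# Hypothetical "perfect" network output: exactly the detail the low-res frame lacks.
base = upsample(low_res, 2)
ideal_detail = [[f - b for f, b in zip(fr, br)] for fr, br in zip(frame, base)]

restored = reconstruct(low_res, ideal_detail, 2)
```

In this toy setting the network only has to supply the detail missing from the upsampled frame, which is the intuition behind using the low-resolution frame as a residual connection. The "perfect" output here is computed directly only to check the arithmetic.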

Abstract

Video compression technology is essential for transmitting and storing videos. Many video compression methods reduce information in videos by removing high-frequency components and exploiting similarities between frames. Alternatively, implicit neural representations (INRs) for videos use networks to represent videos and compress them through model compression. A conventional method improves reconstruction quality by using frame features. However, the detailed representation of the frames can still be improved. To improve the quality of reconstructed frames, we propose a method that uses low-resolution frames as a residual connection, which is considered effective for image reconstruction. Experimental results show that our method outperforms the existing method, HNeRV, in PSNR for 46 of the 49 videos.
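
For readers unfamiliar with the metric used in the comparison, PSNR can be computed as below. This is a generic sketch of the standard definition, assuming pixel values normalised to [0, 1] (so `peak=1.0`); the function name `psnr` and the list-of-rows frame format are choices made for this example and are not taken from the paper.

```python
import math


def psnr(reference, reconstructed, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D frames."""
    squared_errors = [
        (r - x) ** 2
        for ref_row, rec_row in zip(reference, reconstructed)
        for r, x in zip(ref_row, rec_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


# Toy example: every pixel is off by 0.1, so MSE is about 0.01 and PSNR about 20 dB.
score = psnr([[0.0, 1.0]], [[0.1, 0.9]])
```

A higher PSNR means the reconstruction is closer to the original frame.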

Paper Structure

This paper contains 7 sections, 2 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: The pipeline. (a) NeRV. (b) HNeRV. (c) Ours.
  • Figure 2: Visualization of reconstructed videos.
  • Figure 3: Video compression results on DAVIS dataset.