Boosting Neural Video Representation via Online Structural Reparameterization

Ziyi Li; Qingyu Mao; Shuai Liu; Qilei Li; Fanyang Meng; Yongsheng Liang

TL;DR

The paper tackles the capacity bottleneck in Neural Video Representation (NVR) without increasing decoding cost. It introduces Online-RepNeRV, which employs an Enhanced Reparameterization Block (ERB) with multi-branch convolutions and an online parameter fusion strategy to boost training-time expressiveness; after training, branches are merged into a single kernel to maintain efficient inference. Empirical results show consistent PSNR/MS-SSIM gains over multiple baselines, especially in early training stages, and robust performance across datasets, with ablations highlighting the importance of branch design and online fusion. Overall, the work demonstrates that training-time architectural expansion can substantially improve NVR quality while preserving practical decoding efficiency and providing a flexible, plug-and-play approach for existing NVR pipelines.

Abstract

Neural Video Representation (NVR) is a promising paradigm for video compression, showing great potential for improving video storage and transmission efficiency. Although recent work has refined network architectures to improve representational capability, these methods typically involve complex designs, which may increase computational overhead and make them hard to integrate into other frameworks. Moreover, the inherent limitation in model capacity restricts the expressiveness of NVR networks, resulting in a performance bottleneck. To overcome these limitations, we propose Online-RepNeRV, an NVR framework based on online structural reparameterization. Specifically, we propose a universal reparameterization block named ERB, which incorporates multiple parallel convolutional paths to enhance model capacity. To mitigate the overhead, an online reparameterization strategy is adopted to dynamically fuse the parameters during training, and the multi-branch structure is equivalently converted into a single-branch structure after training. As a result, the additional computational and parameter complexity is confined to the encoding stage and does not affect decoding efficiency. Extensive experiments on mainstream video datasets demonstrate that our method achieves an average PSNR gain of 0.37-2.7 dB over baseline methods, while maintaining comparable training time and decoding speed.
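
The equivalence that lets a multi-branch block collapse into a single branch follows from the linearity of convolution: outputs of parallel branches applied to the same input can be summed, and so can their kernels. The following is a minimal sketch of that idea only, not the ERB design itself. The branch layout (one 3x3 branch, one 1x1 branch, and an identity path), the single-channel setting, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
# Illustrative sketch: the branch layout (3x3 + 1x1 + identity) and every name
# below are assumptions for demonstration, not the ERB block from the paper.

def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy, xx = y + i - pad, x + j - pad
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def multi_branch_forward(image, k3, k1):
    """Training-time view: run each branch separately and sum the outputs."""
    a = conv2d_same(image, k3)   # 3x3 branch
    b = conv2d_same(image, k1)   # 1x1 branch
    h, w = len(image), len(image[0])
    # The identity path contributes the input itself.
    return [[a[y][x] + b[y][x] + image[y][x] for x in range(w)] for y in range(h)]


def fuse_branches(k3, k1):
    """Merge the three branches into one 3x3 kernel.

    A 1x1 kernel equals a 3x3 kernel that is zero everywhere except the centre,
    and the identity path equals a 3x3 kernel with 1.0 at the centre.
    """
    fused = [row[:] for row in k3]
    fused[1][1] += k1[0][0]  # fold in the 1x1 branch
    fused[1][1] += 1.0       # fold in the identity path
    return fused


image = [[1.0, 2.0, 0.5, -1.0],
         [0.0, 3.0, 1.5, 2.0],
         [-2.0, 1.0, 0.0, 4.0],
         [0.5, -0.5, 2.5, 1.0]]
k3 = [[0.1, -0.2, 0.3],
      [0.0, 0.5, -0.1],
      [0.2, 0.1, -0.3]]
k1 = [[0.25]]

train_out = multi_branch_forward(image, k3, k1)
fused = fuse_branches(k3, k1)
infer_out = conv2d_same(image, fused)  # single-branch view after merging

# Largest element-wise difference between the two views.
max_diff = max(abs(train_out[y][x] - infer_out[y][x])
               for y in range(4) for x in range(4))
```

Here `max_diff` stays at floating-point rounding level, which is why the merged network can decode at the cost of a single convolution. The online strategy described in the abstract goes further by fusing parameters during training; this sketch does not model that step.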
