SRNeRV: A Scale-wise Recursive Framework for Neural Video Representation

Jia Wang, Jun Zhu, Xinfeng Zhang

TL;DR

SRNeRV is a scale-wise recursive framework for multi-scale INR video generators. It replaces the usual stack of independent per-scale processing blocks with a parameter-efficient shared architecture, and it achieves a significant rate-distortion performance boost, which the authors take as evidence that the sharing scheme amplifies the core strengths of the INR paradigm.

Abstract

Implicit Neural Representations (INRs) have emerged as a promising paradigm for video representation and compression. However, existing multi-scale INR generators often suffer from significant parameter redundancy by stacking independent processing blocks for each scale. Inspired by the principle of scale self-similarity in the generation process, we propose SRNeRV, a novel scale-wise recursive framework that replaces this stacked design with a parameter-efficient shared architecture. The core of our approach is a hybrid sharing scheme derived from decoupling the processing block into a scale-specific spatial mixing module and a scale-invariant channel mixing module. We recursively apply the same shared channel mixing module, which contains the majority of the parameters, across all scales, significantly reducing the model size while preserving the crucial capacity to learn scale-specific spatial patterns. Extensive experiments demonstrate that SRNeRV achieves a significant rate-distortion performance boost, especially in INR-friendly scenarios, validating that our sharing scheme successfully amplifies the core strengths of the INR paradigm.
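To make the parameter argument in the abstract concrete, the following is a minimal counting sketch, not the authors' implementation. It assumes a hypothetical block in which spatial mixing is a depthwise k x k convolution and channel mixing is a pair of 1 x 1 convolutions with an expansion ratio, and it assumes the channel width stays constant across scales; none of these layer choices, nor the function names and example sizes, are stated in the text above. Under those assumptions, sharing the channel mixing module means its cost is paid once rather than once per scale.

```python
def block_params(channels, kernel, expansion):
    """Parameter counts for one hypothetical block (biases ignored).

    Assumed layout, for illustration only:
      spatial mixing: depthwise kernel x kernel convolution
      channel mixing: two 1x1 convolutions, C -> expansion*C -> C
    """
    spatial = channels * kernel * kernel
    channel = 2 * expansion * channels * channels
    return spatial, channel


def stacked_params(num_scales, channels, kernel, expansion):
    """Independent block per scale: every scale pays for both modules."""
    spatial, channel = block_params(channels, kernel, expansion)
    return num_scales * (spatial + channel)


def shared_params(num_scales, channels, kernel, expansion):
    """Hybrid sharing: spatial mixing per scale, channel mixing stored once."""
    spatial, channel = block_params(channels, kernel, expansion)
    return num_scales * spatial + channel


if __name__ == "__main__":
    # Hypothetical sizes chosen only to illustrate the ratio.
    scales, width, k, r = 4, 64, 3, 4
    stacked = stacked_params(scales, width, k, r)
    shared = shared_params(scales, width, k, r)
    print(f"stacked: {stacked} parameters")
    print(f"shared:  {shared} parameters")
    print(f"ratio:   {shared / stacked:.3f}")
```

With these illustrative sizes the channel mixing module dominates each block, so the shared count approaches one block's worth of channel mixing parameters plus a small per-scale spatial term. The actual savings reported for SRNeRV depend on its real layer configuration, which this excerpt does not specify.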

Paper Structure

This paper contains 10 sections, 5 equations, 3 figures, 1 table, and 1 algorithm.

Figures (3)

  • Figure 1: Self-similarity in the multi-scale feature and corresponding generation process.
  • Figure 2: Overview of the SRNeRV architecture. (Left) The macro-architecture progressively generates higher-scale frames by recursively applying shared SRNeRV-Blocks. (Center) The block's micro-architecture. (Right) The hybrid sharing design is motivated by the distinct functions and parameter distribution of the block's components: the channel mixing module contains most of the parameters and performs a scale-invariant task, so it is the part shared across all scales.
  • Figure 3: Rate-distortion (RD) performance on different datasets.