RepNet-VSR: Reparameterizable Architecture for High-Fidelity Video Super-Resolution

Biao Wu, Diankai Zhang, Shaoli Liu, Si Gao, Chengjian Zheng, Ning Wang

TL;DR

This paper addresses real-time, on-device video super-resolution with RepNet-VSR, a reparameterizable architecture that uses multi-level feature fusion and pairs a richer training-time structure with an efficient deployment-time network. A NAS-driven framework balances reconstruction fidelity against FLOPs, aided by channel-wise 1×1 reductions and a reparameterization block that merges the training-time depth and breadth into a single efficient path at inference. On the REDS validation set, RepNet-VSR achieves 27.79 dB PSNR for 4× upscaling (180p→720p) at 103 ms per 10 frames on a MediaTek Dimensity NPU, exceeding prior MAI2025 competition methods in the fidelity-efficiency tradeoff. The work points to a practical path for deploying high-quality VSR on resource-constrained devices and enabling real-time mobile video enhancement, supported by both quantitative metrics and qualitative visual comparisons.
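
The merging step mentioned above can be illustrated with a minimal sketch of generic structural reparameterization. It assumes a RepVGG-style block with a 3×3 branch, a 1×1 branch, and an identity skip; this page does not specify the paper's actual RepConv layout, so the branch set, the function names (`conv2d`, `merge_branches`), and the tensor sizes are all hypothetical. It is written in plain Python with nested lists so that it runs without dependencies.

```python
import random

def conv2d(x, w, b):
    """Naive zero-padded 'same' convolution on nested lists.
    x: [C_in][H][W], w: [C_out][C_in][k][k], b: [C_out]."""
    k = len(w[0][0])
    p = k // 2
    h, wd = len(x[0]), len(x[0][0])
    out = []
    for o, kernel in enumerate(w):
        plane = [[b[o]] * wd for _ in range(h)]
        for c, kc in enumerate(kernel):
            for i in range(k):
                for j in range(k):
                    for r in range(h):
                        rr = r + i - p
                        if not 0 <= rr < h:
                            continue  # tap falls in the zero padding
                        for s in range(wd):
                            ss = s + j - p
                            if 0 <= ss < wd:
                                plane[r][s] += kc[i][j] * x[c][rr][ss]
        out.append(plane)
    return out

def merge_branches(w3, b3, w1, b1):
    """Fold a 3x3 branch, a 1x1 branch and an identity skip into one 3x3 kernel.
    Assumes C_in == C_out so that the identity skip is well defined."""
    merged = [[[row[:] for row in kc] for kc in kernel] for kernel in w3]
    for o in range(len(w3)):
        for c in range(len(w3[o])):
            merged[o][c][1][1] += w1[o][c][0][0]  # 1x1 kernel sits on the centre tap
            if o == c:
                merged[o][c][1][1] += 1.0         # identity = 1x1 conv with identity matrix
    bias = [b3[o] + b1[o] for o in range(len(w3))]
    return merged, bias

def rand(*shape):
    """Nested list of Gaussian samples with the given shape."""
    if len(shape) == 1:
        return [random.gauss(0, 1) for _ in range(shape[0])]
    return [rand(*shape[1:]) for _ in range(shape[0])]

random.seed(0)
C, H, W = 3, 5, 5  # illustrative sizes only
x = rand(C, H, W)
w3, b3 = rand(C, C, 3, 3), rand(C)
w1, b1 = rand(C, C, 1, 1), rand(C)

# Training-time form: three parallel branches summed.
y3, y1 = conv2d(x, w3, b3), conv2d(x, w1, b1)
train_out = [[[y3[c][r][s] + y1[c][r][s] + x[c][r][s] for s in range(W)]
              for r in range(H)] for c in range(C)]

# Deployment-time form: one 3x3 convolution with the merged kernel.
w_merged, b_merged = merge_branches(w3, b3, w1, b1)
deploy_out = conv2d(x, w_merged, b_merged)

max_abs_diff = max(abs(train_out[c][r][s] - deploy_out[c][r][s])
                   for c in range(C) for r in range(H) for s in range(W))
```

Because convolution is linear, the two forms agree up to floating-point rounding, so the extra branches cost nothing at inference. Folding batch normalization or stacked convolutions into the kernel follows the same principle but is not shown here.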

Abstract

As a fundamental challenge in visual computing, video super-resolution (VSR) focuses on reconstructing high-definition video sequences from their degraded low-resolution counterparts. While deep convolutional neural networks have demonstrated state-of-the-art performance in spatial-temporal super-resolution tasks, their computationally intensive nature poses significant deployment challenges for resource-constrained edge devices, particularly in real-time mobile video processing scenarios where power efficiency and latency constraints coexist. In this work, we propose a Reparameterizable Architecture for High-Fidelity Video Super-Resolution, named RepNet-VSR, for real-time 4× video super-resolution. On the REDS validation set, the proposed model achieves 27.79 dB PSNR when processing 180p to 720p frames in 103 ms per 10 frames on a MediaTek Dimensity NPU. The competition results demonstrate an excellent balance between restoration quality and deployment efficiency. The proposed method scores higher than the previous champion algorithm of the MAI video super-resolution challenge.
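
To put the reported latency in per-frame terms, the short calculation below converts "103 ms per 10 frames" into a frame time and a throughput. The 30 fps playback target and the 320×180 and 1280×720 frame sizes are assumptions made for illustration (16:9 frames at 180p and 720p); only the 103 ms figure and the 4× scale come from the text.

```python
# Reported figure from the abstract: 103 ms to process 10 frames.
latency_ms_per_10_frames = 103.0

ms_per_frame = latency_ms_per_10_frames / 10   # 10.3 ms per frame
fps = 1000.0 / ms_per_frame                    # about 97 frames per second

# Assumed playback target (not stated in the source).
target_fps = 30
budget_ms = 1000.0 / target_fps                # about 33.3 ms per frame
headroom = budget_ms / ms_per_frame            # how many times faster than needed

# Assumed 16:9 frame sizes for 180p input and 720p output.
in_w, in_h = 320, 180
scale = 4
out_w, out_h = in_w * scale, in_h * scale      # 1280 x 720
out_pixels_per_second = out_w * out_h * fps

print(f"{ms_per_frame:.1f} ms/frame, {fps:.1f} fps, {headroom:.2f}x the 30 fps budget")
```

Under these assumptions the model runs roughly three times faster than a 30 fps stream requires, which is what makes the "real-time" claim plausible on the stated hardware.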

Paper Structure

This paper contains 17 sections, 2 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: RepNet-VSR architecture overview.
  • Figure 2: RepConv overview.
  • Figure 3: Qualitative comparison on the REDS validation set. Zoom in for better visualization.