Exploring Real-Time Super-Resolution: Benchmarking and Fine-Tuning for Streaming Content
Evgeney Bogatyrev, Khaled Abud, Ivan Molodetskikh, Nikita Alutis, Dmitry Vatolin
TL;DR
This work tackles real-time super-resolution for heavily compressed streaming video by introducing the StreamSR benchmark and EfRLFN, an efficient SR model. StreamSR provides a large-scale, diverse set of YouTube-derived LR-HR pairs with real-world compression artifacts, enabling realistic benchmarking of 11 real-time SR models. EfRLFN combines ERLFB blocks, tanh-based refinement, and Efficient Channel Attention with a composite Charbonnier–VGG–Sobel loss to deliver superior quality and speed, outperforming contemporaries in both objective metrics and subjective user studies. The study also shows that fine-tuning existing models on StreamSR yields broad performance gains across standard benchmarks, underscoring the value of dataset-aligned training for real-time SR deployment.
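To make the composite Charbonnier–VGG–Sobel loss more concrete, the following is a minimal NumPy sketch, not the authors' implementation. The weights `w_pix`, `w_feat`, `w_edge`, the value of `eps`, and the choice of an L1 distance between Sobel gradient magnitudes are all assumptions made for illustration. The VGG term is represented by an optional, hypothetical `features` callable, since a pretrained network is outside the scope of a short example.

```python
import numpy as np

def charbonnier(pred, target, eps=1e-3):
    """Charbonnier penalty: a smooth, differentiable variant of the L1 loss."""
    return float(np.mean(np.sqrt((pred - target) ** 2 + eps ** 2)))

def sobel_magnitude(img):
    """Gradient magnitude of a 2-D grayscale image using 3x3 Sobel filters.

    Only the valid interior is returned, so the output is (H-2, W-2).
    """
    # Horizontal derivative: right column minus left column, weighted 1-2-1.
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[1:-1, :-2] - img[2:, :-2])
    # Vertical derivative: bottom row minus top row, weighted 1-2-1.
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[:-2, 1:-1] - img[:-2, 2:])
    return np.sqrt(gx ** 2 + gy ** 2)

def composite_loss(pred, target, features=None,
                   w_pix=1.0, w_feat=0.1, w_edge=0.1):
    """Weighted sum of pixel, edge, and (optional) feature-space terms.

    The weights are illustrative assumptions, not values from the paper.
    `features` is a hypothetical stand-in for a VGG feature extractor.
    """
    loss = w_pix * charbonnier(pred, target)
    edge_diff = sobel_magnitude(pred) - sobel_magnitude(target)
    loss += w_edge * float(np.mean(np.abs(edge_diff)))
    if features is not None:
        feat_diff = features(pred) - features(target)
        loss += w_feat * float(np.mean(feat_diff ** 2))
    return loss
```

In this sketch the Charbonnier term drives per-pixel fidelity, the Sobel term penalizes differences in edge strength, and the feature term, when supplied, compares the images in a learned feature space.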
Abstract
Recent advances in real-time super-resolution have enabled higher-quality video streaming, yet existing methods struggle with the unique challenges of compressed video content. Commonly used datasets do not accurately reflect the characteristics of streaming media, limiting the relevance of current benchmarks. To address this gap, we introduce StreamSR, a comprehensive dataset sourced from YouTube that covers a wide range of video genres and resolutions representative of real-world streaming scenarios. We benchmark 11 state-of-the-art real-time super-resolution models to evaluate their performance for the streaming use case. Furthermore, we propose EfRLFN, an efficient real-time model that integrates Efficient Channel Attention and a hyperbolic tangent activation function, a novel design choice in the context of real-time super-resolution. We extensively optimized the architecture to maximize efficiency and designed a composite loss function that improves training convergence. EfRLFN combines the strengths of existing architectures while improving both visual quality and runtime performance. Finally, we show that fine-tuning other models on our dataset results in significant performance gains that generalize well across various standard benchmarks. The dataset, code, and benchmark are available at https://github.com/EvgeneyBogatyrev/EfRLFN.
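As a rough illustration of the Efficient Channel Attention mechanism mentioned above, the sketch below follows the general ECA recipe: global average pooling per channel, a small 1-D convolution across the channel axis, and a sigmoid gate. It is not taken from the EfRLFN code; the function name `eca`, the kernel size, and the zero padding at the channel boundaries are assumptions, and in a trained model the kernel weights would be learned.

```python
import numpy as np

def eca(x, kernel):
    """Apply channel attention to a feature map of shape (C, H, W).

    `kernel` holds the shared 1-D weights (odd length k) that mix each
    channel descriptor with its k // 2 neighbours on either side.
    """
    c = x.shape[0]
    k = len(kernel)
    # One scalar descriptor per channel via global average pooling.
    pooled = x.mean(axis=(1, 2))
    # Zero-pad so that the 1-D convolution returns exactly C values.
    padded = np.pad(pooled, k // 2)
    # Sliding dot product over the channel axis with shared weights.
    logits = np.array([np.dot(padded[i:i + k], kernel) for i in range(c)])
    # Sigmoid gate in (0, 1), broadcast over the spatial dimensions.
    gate = 1.0 / (1.0 + np.exp(-logits))
    return x * gate[:, None, None]
```

Because the gate is computed from only `k` weights regardless of the channel count, this form of attention adds very little overhead, which is consistent with its use in a real-time model.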
