Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning
Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Haoze Sun, Xueyi Zou, Zhensong Zhang, Youliang Yan, Lei Zhu
TL;DR
This work presents Low-Res Leads the Way (LWay), a practical super-resolution (SR) framework that combines supervised pre-training on synthetic degradations with self-supervised fine-tuning on unseen real-world test images. It introduces a low-resolution (LR) reconstruction network that extracts a degradation embedding from the LR input, and it uses a weighting derived from the Discrete Wavelet Transform (DWT) to steer the restoration of high-frequency (HF) detail, so models can adapt without architecture changes. Across RealSR, DRealSR, and multiple SR backbones (CNNs, Transformers, VQGAN, diffusion), LWay yields consistent gains in perceptual and fidelity metrics, demonstrating improved generalization to real-world degradations with fast test-time optimization. The approach reduces reliance on paired real data, stays compatible across models, and offers a scalable, deployment-friendly path toward robust real-world SR.
Abstract
For image super-resolution (SR), bridging the gap between performance on synthetic datasets and on real-world degradation scenarios remains a challenge. This work introduces a novel "Low-Res Leads the Way" (LWay) training framework that combines supervised pre-training with self-supervised learning to improve the adaptability of SR models to real-world images. Our approach uses a low-resolution (LR) reconstruction network to extract degradation embeddings from LR images and merges them with the super-resolved outputs to reconstruct the LR input. Using unseen LR images for self-supervised learning guides the model to adapt its modeling space to the target domain, so SR models can be fine-tuned without paired high-resolution (HR) images. A Discrete Wavelet Transform (DWT) further sharpens the focus on high-frequency details. Extensive evaluations show that our method significantly improves the generalization and detail restoration capabilities of SR models on unseen real-world datasets, outperforming existing methods. Our training regime requires no network architecture modifications, which makes it compatible with existing SR models and practical for real-world SR applications.
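As a rough illustration of the DWT-guided self-supervision described above, the sketch below weights an LR reconstruction error by a high-frequency map computed from the input LR image. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the abstract does not give the loss formula, so the single-level Haar transform, the L1 error, the `1 + alpha * w` weighting, and the names `haar_dwt2`, `high_freq_weight`, `weighted_lr_loss`, and `alpha` are all hypothetical. The LR reconstruction network itself is out of scope here; its output is represented by the array `lr_rec`.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level 2D Haar DWT of an (H, W) array with even H and W.

    Returns the approximation band and three detail bands,
    each of shape (H/2, W/2).
    """
    a = img[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0
    detail_rows = (a + b - c - d) / 2.0  # change across rows
    detail_cols = (a - b + c - d) / 2.0  # change across columns
    detail_diag = (a - b - c + d) / 2.0  # diagonal change
    return approx, detail_rows, detail_cols, detail_diag


def high_freq_weight(lr, eps=1e-8):
    """Per-pixel weight in [0, 1] that is large where the LR image has detail."""
    _, d_rows, d_cols, d_diag = haar_dwt2(lr)
    energy = np.abs(d_rows) + np.abs(d_cols) + np.abs(d_diag)
    # Copy each coefficient back onto its 2x2 block to recover shape (H, W).
    energy = np.kron(energy, np.ones((2, 2)))
    return energy / (energy.max() + eps)


def weighted_lr_loss(lr, lr_rec, alpha=1.0):
    """L1 error between the LR input and its reconstruction,
    up-weighted in high-frequency regions (assumed form, see lead-in)."""
    w = high_freq_weight(lr)
    return float(np.mean((1.0 + alpha * w) * np.abs(lr - lr_rec)))
```

Because the weight map is computed from the real LR input rather than from the reconstruction, an error of the same size costs more on an edge or texture than in a flat region, which is the behaviour the high-frequency emphasis is meant to produce. No HR image appears anywhere in the loss, which matches the claim that fine-tuning needs only unseen LR images.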
