OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution

Xinning Chai; Zhengxue Cheng; Yuhong Zhang; Hengsheng Zhang; Yingsheng Qin; Yucai Yang; Rong Xie; Li Song

TL;DR

OmniScaleSR targets faithful and realistic arbitrary-scale image super-resolution (ASSR) by introducing explicit, diffusion-native scale controls (global scale injection and local scale modulation) that work alongside implicit diffusion prior adaptation. The method uses a two-branch latent diffusion framework with multi-domain fidelity enhancements, including pixel-, pixel-to-latent-, and latent-space guidance via dual semantic prompts and SePR attention. Experiments on bicubic and real-world degradations show better fidelity and realism than state-of-the-art diffusion-based and INR-based ASSR methods, especially at ultra-high scales. Limitations include longer inference time and potential semantic bias from prompts, which point to acceleration and more robust prompt handling as future directions. Overall, OmniScaleSR offers a scalable approach to realistic ASSR (Real-ASSR) that maintains high-quality reconstructions across diverse scales and degradation types.

Abstract

Arbitrary-scale super-resolution (ASSR) overcomes the limitation of traditional super-resolution (SR) methods that operate only at fixed scales (e.g., 4x), enabling a single model to handle arbitrary magnification. Most existing ASSR approaches rely on implicit neural representation (INR), but its regression-driven feature extraction and aggregation intrinsically limit the ability to synthesize fine details, leading to low realism. Recent diffusion-based realistic image super-resolution (Real-ISR) models leverage powerful pre-trained diffusion priors and show impressive results at the 4x setting. We observe that they can also achieve ASSR because the diffusion prior implicitly adapts to scale by encouraging high-realism generation. However, without explicit scale control, the diffusion process cannot be properly adjusted for different magnification levels, resulting in excessive hallucination or blurry outputs, especially under ultra-high scales. To address these issues, we propose OmniScaleSR, a diffusion-based realistic arbitrary-scale SR framework designed to achieve both high fidelity and high realism. We introduce explicit, diffusion-native scale control mechanisms that work synergistically with implicit scale adaptation, enabling scale-aware and content-aware modulation of the diffusion process. In addition, we incorporate multi-domain fidelity enhancement designs to further improve reconstruction accuracy. Extensive experiments on bicubic degradation benchmarks and real-world datasets show that OmniScaleSR surpasses state-of-the-art methods in both fidelity and perceptual realism, with particularly strong performance at large magnification factors. Code will be released at https://github.com/chaixinning/OmniScaleSR.
