RFSR: Improving ISR Diffusion Models via Reward Feedback Learning

Xiaopeng Sun; Qinwei Lin; Yu Gao; Yujie Zhong; Chengjian Feng; Dengjie Li; Zheng Zhao; Jie Hu; Lin Ma

TL;DR

RFSR fine-tunes image super-resolution (ISR) diffusion models with reward feedback learning under a timestep-aware training strategy. Early denoising steps are constrained by a low-frequency structure loss, while later steps are optimized against reward models, with Gram-KL regularization to curb reward hacking. The corresponding loss terms, $\mathcal{L}_{\text{dwt}_{ll}}$, $\mathcal{L}_{\text{reward}}$, and $\mathcal{L}_{\text{gram-kl}}$, are scheduled across timesteps. Fine-tuned models show substantial gains in perceptual and aesthetic metrics on synthetic and real-world ISR benchmarks. The method is plug-and-play for existing diffusion-based ISR models; its noted limitation is a reliance on pre-trained diffusion backbones and reward models.

Abstract

Generative diffusion models (DMs) have been extensively utilized in image super-resolution (ISR). Most existing methods adopt the denoising loss from DDPMs for model optimization. We posit that introducing reward feedback learning to fine-tune existing models can further improve the quality of the generated images. In this paper, we propose a timestep-aware training strategy with reward feedback learning. Specifically, in the initial denoising stages of ISR diffusion, we apply low-frequency constraints to super-resolution (SR) images to maintain structural stability. In the later denoising stages, we use reward feedback learning to improve the perceptual and aesthetic quality of the SR images. In addition, we incorporate Gram-KL regularization to alleviate stylization caused by reward hacking. Our method can be integrated into any diffusion-based ISR model in a plug-and-play manner. Experiments show that ISR diffusion models, when fine-tuned with our method, significantly improve the perceptual and aesthetic quality of SR images, achieving excellent subjective results. Code: https://github.com/sxpro/RFSR
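
The timestep-aware schedule described above can be illustrated with a small NumPy sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the switch point `t_switch`, the weights `w_reward` and `w_reg`, the use of a one-level Haar transform for the low-frequency band, an L1 distance to a reference image, and a Gram-KL term computed as the KL divergence between row-wise softmax-normalized Gram matrices of two feature maps. The `reward_fn` callable stands in for whatever reward model is used. Large `t` is treated as the initial (noisiest) denoising stage, following the usual DDPM sampling order.

```python
import numpy as np

def haar_ll(img):
    """LL (low-frequency) sub-band of a one-level 2D Haar DWT for an (H, W) image with even H and W."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return (a + b + c + d) / 2.0

def gram(feat):
    """Gram matrix of a (C, H, W) feature map, averaged over spatial positions."""
    f = feat.reshape(feat.shape[0], -1)
    return f @ f.T / f.shape[1]

def log_softmax(x):
    """Numerically stable row-wise log-softmax."""
    z = x - x.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def gram_kl(feat_ref, feat_new):
    """Assumed form of the regularizer: KL(softmax(G_ref) || softmax(G_new)), summed over rows."""
    lp = log_softmax(gram(feat_ref))
    lq = log_softmax(gram(feat_new))
    return float(np.sum(np.exp(lp) * (lp - lq)))

def timestep_aware_loss(t, t_switch, sr, ref, feat, feat_ref, reward_fn,
                        w_reward=1.0, w_reg=1.0):
    """Pick the loss by timestep (hypothetical schedule, not taken from the paper)."""
    if t >= t_switch:
        # Initial denoising stages: constrain only the low-frequency structure.
        return float(np.mean(np.abs(haar_ll(sr) - haar_ll(ref))))
    # Later stages: maximize the reward, regularized against style drift.
    return -w_reward * float(reward_fn(sr)) + w_reg * gram_kl(feat_ref, feat)
```

With identical feature maps, `gram_kl` evaluates to zero, so in this sketch the regularizer only penalizes outputs whose feature statistics drift away from the reference, which is the intuition behind using it against reward hacking.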
