Exploring Real&Synthetic Dataset and Linear Attention in Image Restoration
Yuzhen Du, Teng Hu, Jiangning Zhang, Ran Yi, Chengming Xu, Xiaobin Hu, Kai Wu, Donghao Luo, Yabiao Wang, Lizhuang Ma
TL;DR
The paper addresses two problems in image restoration (IR): a mismatch between the image-complexity distributions of common training and testing datasets, and the lack of a shared training protocol. For the first, it proposes ReSyn, a large-scale Real&Synthetic IR dataset filtered with a GLCM-based (gray-level co-occurrence matrix) complexity metric. For the second, it establishes a unified training standard so that models can be compared fairly. It also introduces RWKV-IR, a linear-attention IR model that combines global and local modeling through DC-Shift and Cross-Bi-WKV within a three-stage restoration architecture. Experiments on super-resolution (SR), denoising, and JPEG artifact removal show that RWKV-IR achieves strong results and that ReSyn supports robust cross-dataset evaluation. The practical contribution is fairer model comparison and a more efficient, scalable family of IR models built on linear attention.
Abstract
Image restoration (IR) aims to recover high-quality images from degraded inputs, and recent advances in deep learning have significantly improved its performance. However, existing methods lack a unified training benchmark that fixes the number of iterations and the training configuration. We also identify a bias between the image complexity distributions of commonly used IR training and testing datasets, which leads to suboptimal restoration results. To address this, we introduce a large-scale IR dataset called ReSyn, which employs a novel image filtering method based on image complexity to ensure a balanced distribution and includes both real and AIGC synthetic images. We establish a unified training standard that specifies iterations and configurations for image restoration models, focusing on measuring model convergence and restoration capability. Additionally, we enhance transformer-based image restoration models with linear attention by proposing RWKV-IR, which integrates the linear-complexity RWKV into the transformer structure and provides both global and local receptive fields. Instead of using Vision-RWKV directly, we replace the original Q-Shift in RWKV with a Depth-wise Convolution shift (DC-Shift) to better model local dependencies, and combine it with bi-directional attention for comprehensive linear attention. We also introduce a Cross-Bi-WKV module that merges two Bi-WKV modules with different scanning orders to balance horizontal and vertical attention. Extensive experiments validate the effectiveness of our RWKV-IR model.
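As a rough illustration of complexity-based filtering, the sketch below scores a grayscale image by the entropy of its gray-level co-occurrence matrix (GLCM). The text here does not spell out the exact metric, so the use of entropy, the 8-level quantization, the single horizontal offset, and the function names glcm and glcm_entropy are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def glcm(gray, levels=8, dx=1, dy=0):
    """Normalized, symmetric co-occurrence matrix for one non-negative offset.

    Assumes `gray` holds intensities in [0, 255] and is larger than the offset.
    """
    gray = np.asarray(gray, dtype=np.float64)
    # Quantize intensities into `levels` bins (assumed choice: 8).
    q = np.clip((gray / 256.0 * levels).astype(np.int64), 0, levels - 1)
    h, w = q.shape
    # Pair every pixel with its neighbour at offset (dy, dx).
    a = q[0:h - dy, 0:w - dx]
    b = q[dy:h, dx:w]
    m = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(m, (a.ravel(), b.ravel()), 1.0)  # accumulate pair counts
    m = m + m.T  # count each pair in both directions
    return m / m.sum()

def glcm_entropy(gray, levels=8):
    """Entropy (in bits) of the GLCM, used here as a stand-in complexity score."""
    p = glcm(gray, levels)
    nz = p[p > 0]  # skip empty cells to avoid log(0)
    return float(-(nz * np.log2(nz)).sum())

# Usage: a flat image scores lower than a noisy one.
flat = np.full((64, 64), 128)
noise = np.random.default_rng(0).integers(0, 256, size=(64, 64))
scores = {"flat": glcm_entropy(flat), "noise": glcm_entropy(noise)}
```

A dataset filter along these lines would compute the score for every candidate image and keep a subset whose score histogram is balanced; how the bins and thresholds are chosen is not specified here.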

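The idea of running a sequence operator over two scanning orders can be sketched as follows. This example only shows how a feature map is flattened row by row and column by column and how the two results are mapped back to the image grid. The mixer argument is a placeholder for a sequence operator such as Bi-WKV, which is not implemented here, and merging the two branches by averaging is an assumption of this sketch. All function names are hypothetical.

```python
import numpy as np

def row_scan(x):
    """(H, W, C) -> (H*W, C), visiting pixels row by row."""
    h, w, c = x.shape
    return x.reshape(h * w, c)

def col_scan(x):
    """(H, W, C) -> (H*W, C), visiting pixels column by column."""
    h, w, c = x.shape
    return x.transpose(1, 0, 2).reshape(h * w, c)

def row_unscan(seq, h, w):
    """Inverse of row_scan."""
    return seq.reshape(h, w, -1)

def col_unscan(seq, h, w):
    """Inverse of col_scan."""
    return seq.reshape(w, h, -1).transpose(1, 0, 2)

def cross_scan_mix(x, mixer):
    """Apply `mixer` along both scanning orders and merge the results.

    `mixer` maps a (tokens, channels) array to an array of the same shape.
    Averaging the two branches is an assumption, not the paper's merge rule.
    """
    h, w, _ = x.shape
    out_rows = row_unscan(mixer(row_scan(x)), h, w)
    out_cols = col_unscan(mixer(col_scan(x)), h, w)
    return 0.5 * (out_rows + out_cols)

# Usage with a toy mixer: a cumulative sum along the token axis.
x = np.arange(6, dtype=np.float64).reshape(2, 3, 1)
y = cross_scan_mix(x, lambda seq: np.cumsum(seq, axis=0))
```

With a directional mixer such as the cumulative sum above, the row-order branch propagates information mainly along rows and the column-order branch along columns, which is the intuition behind combining the two.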