Empowering Image Recovery: A Multi-Attention Approach
Juan Wen, Yawei Li, Chao Zhang, Weiyan Hou, Radu Timofte, Luc Van Gool
TL;DR
The paper tackles the challenge of high-quality image restoration across diverse tasks by enabling a model to systematically integrate information from long sequences, local and global contexts, and multiple feature and positional dimensions. It introduces Diverse Restormer (DART), a multi-attention transformer built on a SwinIR-like backbone that combines LongIR attention for long-range dependencies with Feature Dimension Attention and Position Dimension Attention to refine information across channels and spatial dimensions. The approach demonstrates state-of-the-art performance across five restoration tasks while maintaining compact model sizes (e.g., DART-B with ~4.5M parameters) and shows substantial efficiency gains over competing methods. Ablation studies confirm the contributions of LongIR and the dimension-wise attentions, and experiments on real and synthetic data underscore DART’s robustness and practical impact for scalable, high-fidelity image recovery.
Abstract
We propose Diverse Restormer (DART), a novel image restoration method that integrates information from multiple sources (long sequences, local and global regions, feature dimensions, and positional dimensions) to address restoration challenges. Transformer models have demonstrated excellent performance in image restoration thanks to their self-attention mechanism, but they face limitations in complex scenarios. Building on recent advances in Transformers and attention mechanisms, our method uses customized attention mechanisms to improve overall performance. DART employs windowed attention to mimic the selective focusing mechanism of human eyes. By dynamically adjusting receptive fields, it captures the fundamental features crucial for image resolution reconstruction. The LongIR attention mechanism balances efficiency and performance for long-sequence image restoration. Integrating attention across feature and positional dimensions further enhances the recovery of fine details. Evaluations across five restoration tasks consistently place DART at the forefront. Upon acceptance, we commit to making code and models publicly available to ensure reproducibility and facilitate further research.
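To make the distinction between attention over positional dimensions and attention over feature dimensions concrete, the following is a minimal sketch and not the DART implementation. It assumes the simplest possible formulation: queries, keys, and values are all the raw input `x` (no learned projections, heads, or windows), and the function names `position_attention` and `feature_attention` are hypothetical. The only point illustrated is the shape of the attention map: N x N when attending across positions, C x C when attending across feature channels.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def position_attention(x):
    """Attend across the N positions of x (shape N x C).

    Assumption: Q = K = V = x. The attention map has shape N x N.
    """
    scale = math.sqrt(len(x[0]))  # scale by sqrt(C)
    scores = matmul(x, transpose(x))  # N x N
    weights = [softmax([s / scale for s in row]) for row in scores]
    return weights, matmul(weights, x)  # output is N x C


def feature_attention(x):
    """Attend across the C feature channels of x (shape N x C).

    Assumption: Q = K = V = x transposed. The attention map has shape C x C.
    """
    xt = transpose(x)  # C x N
    scale = math.sqrt(len(x))  # scale by sqrt(N)
    scores = matmul(xt, x)  # C x C
    weights = [softmax([s / scale for s in row]) for row in scores]
    return weights, transpose(matmul(weights, xt))  # output is N x C


# Toy input: 4 positions (e.g. pixels in a window), 2 feature channels.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
pos_weights, pos_out = position_attention(x)
feat_weights, feat_out = feature_attention(x)
print(len(pos_weights), "x", len(pos_weights[0]))    # size of the positional map
print(len(feat_weights), "x", len(feat_weights[0]))  # size of the feature map
```

Under these assumptions the cost of the positional map grows with the square of the number of positions, while the feature map grows with the square of the channel count, which is one reason the two forms are often combined in restoration networks.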
