Fractal-IR: A Unified Framework for Efficient and Scalable Image Restoration

Yawei Li, Bin Ren, Jingyun Liang, Rakesh Ranjan, Mengyuan Liu, Nicu Sebe, Ming-Hsuan Yang, Luca Benini

TL;DR

Fractal-IR introduces a fractal information flow to image restoration (IR), enabling progressive global context aggregation without dense global self-attention. By organizing computation into three hierarchical levels (local $\mathscr{L}_1$, non-local $\mathscr{L}_2$, and global $\mathscr{L}_3$) and stacking Fractal-IR layers with residual connections, the framework achieves strong performance across seven IR tasks while improving efficiency. The authors address scaling challenges with a triad of strategies (warmup, lightweight convolutions, and dot-product attention) and demonstrate superior results on super-resolution (SR), denoising, JPEG compression artifact removal (CAR), adverse weather, motion deblurring, defocus deblurring, and demosaicking, including notable gains on Manga109 and Urban100. The work also includes extensive ablations and comparisons with ShuffleFormer/Shuffle Transformer, highlighting improved information flow, parameter efficiency, and practical scalability for real-world restoration tasks.

Abstract

While vision transformers achieve significant breakthroughs in various image restoration (IR) tasks, it is still challenging to efficiently scale them across multiple types of degradations and resolutions. In this paper, we propose Fractal-IR, a fractal-based design that progressively refines degraded images by repeatedly expanding local information into broader regions. This fractal architecture naturally captures local details at early stages and seamlessly transitions toward global context in deeper fractal stages, removing the need for computationally heavy long-range self-attention mechanisms. Moreover, we observe that scaling up vision transformers for IR tasks is itself challenging. Through a series of analyses, we identify a holistic set of strategies to effectively guide model scaling. Extensive experimental results show that Fractal-IR achieves state-of-the-art performance in seven common image restoration tasks, including super-resolution, denoising, JPEG artifact removal, IR in adverse weather conditions, motion deblurring, defocus deblurring, and demosaicking. For $2\times$ SR on Manga109, Fractal-IR achieves a 0.21 dB PSNR gain. For grayscale image denoising on Urban100, Fractal-IR surpasses the previous method by 0.2 dB for $\sigma=50$.
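To put the reported PSNR gains in perspective, the sketch below converts a dB gain into the corresponding relative change in mean squared error, using the standard definition PSNR = 10 log10(MAX^2 / MSE). It is only an illustration: the function names and the baseline MSE value are hypothetical and do not come from the paper, and an 8-bit peak value of 255 is assumed.

```python
import math


def psnr(mse: float, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which MSE shrinks when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


# Hypothetical baseline error, chosen only to make the arithmetic concrete.
baseline_mse = 20.0

# 0.21 dB is the Manga109 gain quoted in the abstract.
ratio = mse_ratio_for_gain(0.21)
improved_mse = baseline_mse * ratio
gain = psnr(improved_mse) - psnr(baseline_mse)

print(f"MSE ratio: {ratio:.4f}")
print(f"PSNR gain: {gain:.2f} dB")
```

Under these assumptions, a 0.21 dB gain corresponds to roughly 4.7% lower MSE, regardless of the baseline value chosen, because the peak value cancels when two PSNR figures are subtracted.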

Paper Structure

This paper contains 26 sections, 8 equations, 24 figures, 21 tables.

Figures (24)

  • Figure 1: The proposed Fractal-IR is notable for its efficiency and effectiveness (a)-(b), generalizability across seven image restoration tasks (a)-(g), and improvements in the visual quality of restored images (h)-(j).
  • Figure 2: Illustration of information flow principles. The colors represent local information, with their blending indicating propagation beyond the local region. (a) CNN-based. (b) Global-attention-based. (c) Window-attention-based. (d) The proposed hierarchical information flow prototype.
  • Figure 3:
  • Figure 4:
  • Figure 5:
  • ...and 19 more figures