Local Implicit Wavelet Transformer for Arbitrary-Scale Super-Resolution

Minghong Duan, Linhao Qu, Shaolei Liu, Manning Wang

TL;DR

The Local Implicit Wavelet Transformer (LIWT) uses wavelet-derived high-frequency priors to improve the restoration of high-frequency texture details, and it outperforms other state-of-the-art methods on arbitrary-scale SR tasks.

Abstract

Implicit neural representations have recently shown promise for arbitrary-scale Super-Resolution (SR) of images. Most existing methods predict each pixel of the SR image from the queried coordinate and an ensemble of nearby features. They overlook high-frequency prior information, which limits their ability to reconstruct high-frequency texture details. To address this issue, we propose the Local Implicit Wavelet Transformer (LIWT) to enhance the restoration of high-frequency texture details. Specifically, we decompose the features extracted by an encoder into four sub-bands containing different frequency information using the Discrete Wavelet Transform (DWT). We then introduce the Wavelet Enhanced Residual Module (WERM) to transform these four sub-bands into high-frequency priors, and use the Wavelet Mutual Projected Fusion (WMPF) and the Wavelet-aware Implicit Attention (WIA) to fully exploit these priors when recovering high-frequency details. We conducted extensive experiments on benchmark datasets to validate the effectiveness of LIWT. Both qualitative and quantitative results demonstrate that LIWT achieves promising performance in arbitrary-scale SR tasks, outperforming other state-of-the-art methods. The code is available at https://github.com/dmhdmhdmh/LIWT.
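
The decomposition step mentioned in the abstract, splitting encoder features into four frequency sub-bands, can be illustrated with a minimal sketch. It assumes a single-level orthonormal Haar wavelet, which the abstract does not specify, and the names `haar_dwt2` and `haar_idwt2` are hypothetical and not taken from the LIWT code. The labelling of the two mixed sub-bands as LH and HL also varies between libraries.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT over the last two axes (H and W must be even).

    Assumption: Haar wavelet with orthonormal scaling; not confirmed by the source.
    Returns four sub-bands, each with half the spatial resolution of x.
    """
    a = x[..., 0::2, 0::2]  # top-left sample of every 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: interleave the four sub-bands back into one array."""
    out_shape = ll.shape[:-2] + (ll.shape[-2] * 2, ll.shape[-1] * 2)
    x = np.empty(out_shape, dtype=ll.dtype)
    x[..., 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[..., 0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[..., 1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[..., 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

# Usage: a stand-in for an encoder feature map with shape (channels, H, W).
features = np.random.default_rng(0).standard_normal((4, 8, 8))
ll, lh, hl, hh = haar_dwt2(features)
reconstructed = haar_idwt2(ll, lh, hl, hh)
```

Each sub-band here has shape `(4, 4, 4)`, and the transform is invertible, so no information is lost by the split. The three detail bands are the natural inputs for building a high-frequency prior, while the first band carries the smooth content.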

Paper Structure

This paper contains 17 sections, 7 equations, 7 figures, 13 tables.

Figures (7)

  • Figure 1: Motivation and effectiveness of our method. (a) LIIF [chen2021learning] and LTE [lee2022local] have difficulty reconstructing high-frequency details using the local ensemble technique. (b) LIWT introduces the high-frequency prior via DWT. (c) LIWT can reconstruct high-frequency details using attention weights based on the high-frequency prior.
  • Figure 2: (a) Overview of the proposed framework. (b) Diagram of the local grid. (c) Structure of the WIA. (d) Structure of the WERM.
  • Figure 3: Structure of the WHFERB and the WMPF.
  • Figure 4: Visual comparison of MetaSR [hu2019meta], LIIF [chen2021learning], LTE [lee2022local], and LIWT using RDN [zhang2018residual] as the encoder. Zoom in for best view.
  • Figure 5: (a) Visualization of feature maps for each stage of WERM; (b) Attention map visualization for WIA.
  • ...and 2 more figures