Dropout the High-rate Downsampling: A Novel Design Paradigm for UHD Image Restoration

Chen Wu, Ling Wang, Long Peng, Dianjie Lu, Zhuoran Zheng

TL;DR

D2Net enables direct full-resolution inference on UHD images without high-rate downsampling or splitting the images into patches, and exploits the characteristics of the frequency domain to establish long-range feature dependencies.

Abstract

With the popularization of high-end mobile devices, ultra-high-definition (UHD) images have become ubiquitous in our lives. Restoring UHD images is highly challenging because of their extremely large pixel count, which often leads to memory overflow during processing. Existing methods either downsample UHD images at a high rate before processing or split them into multiple patches that are processed separately. However, high-rate downsampling causes significant information loss, while patch-based approaches inevitably introduce boundary artifacts. In this paper, we propose D2Net, a novel design paradigm for UHD image restoration. D2Net enables direct full-resolution inference on UHD images without high-rate downsampling or splitting the images into patches. Specifically, we exploit the characteristics of the frequency domain to establish long-range dependencies between features. To account for the richer local patterns in UHD images, we also design a multi-scale convolutional group to capture local features. Additionally, during the decoding stage, we dynamically incorporate features from the encoding stage to reduce the flow of irrelevant information. Extensive experiments on three UHD image restoration tasks (low-light image enhancement, image dehazing, and image deblurring) show that our model achieves better quantitative and qualitative results than state-of-the-art methods.
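The claim that the frequency domain yields long-range dependencies can be illustrated with a toy example. The sketch below is not the authors' module; it is a minimal 1-D illustration, with hypothetical function names, of the general principle that scaling each Fourier coefficient by a weight is equivalent to a circular convolution whose kernel spans the whole signal. It uses a naive DFT from the standard library to stay self-contained; a real implementation would presumably use a 2-D FFT from a numerical library.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def frequency_filter(signal, weights):
    """Scale every frequency coefficient by a weight, then transform back."""
    spectrum = dft(signal)
    filtered = [s * w for s, w in zip(spectrum, weights)]
    return [value.real for value in idft(filtered)]

def circular_convolution(signal, kernel):
    """Spatial-domain reference: every output mixes all input samples."""
    n = len(signal)
    return [sum(signal[(t - s) % n] * kernel[s] for s in range(n))
            for t in range(n)]

# Illustrative values only (assumptions, not data from the paper).
signal = [0.0, 1.0, 4.0, 2.0, 0.0, 3.0, 1.0, 5.0]
kernel = [0.4, 0.2, 0.1, 0.05, 0.02, 0.05, 0.1, 0.2]  # spans the full signal

via_frequency = frequency_filter(signal, dft(kernel))
via_spatial = circular_convolution(signal, kernel)

# Perturb a single input sample and observe how many outputs change.
perturbed = list(signal)
perturbed[0] += 1.0
after_perturbation = frequency_filter(perturbed, dft(kernel))
changed = sum(abs(a - b) > 1e-9 for a, b in zip(after_perturbation, via_frequency))

print("max difference between the two routes:",
      max(abs(a - b) for a, b in zip(via_frequency, via_spatial)))
print("outputs affected by one changed input:", changed, "of", len(signal))
```

Because the two routes agree, a single element-wise product in the frequency domain gives every output position access to every input position, which is the sense in which frequency-domain operations can model global context without a large spatial kernel.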

Paper Structure

This paper contains 16 sections, 8 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Comparison between previous methods and our proposed method for UHD image restoration. Due to hardware limitations, previous methods resort to patch-based or downsampling-based approaches so that consumer-grade GPUs can process UHD images. However, patch-based methods (e.g., LLFormer) introduce boundary artifacts when the patches are stitched back together, while downsampling-based methods (e.g., UHDFour and NSEN) lose significant information; both effects degrade the quality of image enhancement. In contrast, our D2Net allows for direct full-resolution inference on UHD images.
  • Figure 2: An overview of the network architecture of our D2Net. The degraded image is forwarded to a 4-level hierarchical encoder-decoder U-Net-like structure to obtain a normal-light image. D2Net primarily consists of several Feature Extraction Modules (FEMs) and Adaptive Feature Modulation Modules (AFMMs). The FEM extracts features through global and local feature modeling. The AFMM dynamically adjusts the fusion of encoded and decoded features to suppress the flow of irrelevant information.
  • Figure 3: The illustration of (a) Fourier-based Global Feature Extraction (FGFE) Module, (b) Multi-scale Local Feature Extraction (MLFE) Module and (c) Feedforward Network (FFN).
  • Figure 4: Visual quality comparisons with state-of-the-art methods on UHD-LOL4K dataset. Please zoom in for details.
  • Figure 5: Visual quality comparisons with state-of-the-art methods on UHD-Haze dataset. Please zoom in for details.
  • ...and 3 more figures