Invertible Residual Rescaling Models

Jinmin Li, Tao Dai, Yaohua Zha, Yilu Luo, Longfei Lu, Bin Chen, Zhi Wang, Shu-Tao Xia, Jingyun Zhang

TL;DR

This work proposes Invertible Residual Rescaling Models (IRRM) for image rescaling, which learn a bijection between a high-resolution image and its low-resolution counterpart with a specific distribution. Skip connections bypass the rich low-frequency information, forcing the model to focus on extracting high-frequency information from the image.

Abstract

Invertible Rescaling Networks (IRNs) and their variants have achieved remarkable results in various image processing tasks such as image rescaling. However, we observe that IRNs with deeper networks are difficult to train, which limits their representational ability. To address this issue, we propose Invertible Residual Rescaling Models (IRRM) for image rescaling, which learn a bijection between a high-resolution image and its low-resolution counterpart with a specific distribution. Specifically, IRRM is a deep network that contains several Residual Downscaling Modules (RDMs) with long skip connections. Each RDM consists of several Invertible Residual Blocks (IRBs) with short connections. In this way, the RDM allows rich low-frequency information to be bypassed by skip connections and forces the model to focus on extracting high-frequency information from the image. Extensive experiments show that IRRM performs significantly better than other state-of-the-art methods with far fewer parameters and lower complexity. In particular, IRRM achieves PSNR gains of at least 0.3 dB over both HCFlow and IRN in ×4 rescaling while using only 60% of the parameters and 50% of the FLOPs. The code will be available at https://github.com/THU-Kingmin/IRRM.
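
To make the "bijection with short connections" idea concrete, the following is a minimal sketch of an additive coupling block of the kind used in invertible networks. It is not the authors' IRB: the class name `AdditiveCoupling`, the helper `make_subnet`, and the random linear sub-networks are hypothetical stand-ins chosen for illustration. The identity terms (`low + ...`, `high + ...`) play the role of short skip connections, and the inverse is exact regardless of what the sub-networks compute.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_subnet(channels_in, channels_out):
    """Hypothetical stand-in for a learned sub-network: fixed random linear map + tanh."""
    w = rng.standard_normal((channels_in, channels_out)) * 0.1
    return lambda x: np.tanh(x @ w)

class AdditiveCoupling:
    """Illustrative invertible block (assumed design, not the paper's exact IRB)."""

    def __init__(self, c_low, c_high):
        self.phi = make_subnet(c_high, c_low)   # high-frequency branch -> low update
        self.eta = make_subnet(c_low, c_high)   # low-frequency branch -> high update

    def forward(self, low, high):
        # Identity term acts as a short skip connection for the low branch.
        low_out = low + self.phi(high)
        high_out = high + self.eta(low_out)
        return low_out, high_out

    def inverse(self, low_out, high_out):
        # Undo the two additive updates in reverse order.
        high = high_out - self.eta(low_out)
        low = low_out - self.phi(high)
        return low, high

# Round trip on random features: 5 "pixels", 3 low and 9 high channels.
low = rng.standard_normal((5, 3))
high = rng.standard_normal((5, 9))
block = AdditiveCoupling(3, 9)
low_out, high_out = block.forward(low, high)
low_rec, high_rec = block.inverse(low_out, high_out)
```

Because each update only adds a function of the other branch, stacking many such blocks keeps the whole mapping invertible, which is the property the abstract relies on.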

Paper Structure (16 sections, 6 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: Comparison of state-of-the-art methods (such as IRN and HCFlow) and our IRRM on the Urban100 dataset with ×4 rescaling. SR methods combined with bicubic downscaling are also reported. The suffixes -S, -M, and -L denote small, medium, and large model sizes, respectively. IRRM achieves similar performance using a quarter of the parameters and FLOPs of IRN and HCFlow.
  • Figure 2: Illustration of training gradients for different models. IRRM with residual connections and enhanced residual blocks ($\text{IRRM\_Res\_RB}$) obtains more stable gradients than IRRM without residual connections ($\text{IRRM\_PCB}$), leading to faster convergence and better performance.
  • Figure 3: The overall framework of Invertible Residual Rescaling Models (IRRM). IRRM is composed of Residual Downscaling Modules (RDMs), in which Invertible Residual Blocks (IRBs) are stacked after a wavelet transformation. Each IRB contains three Enhanced Blocks (EBs) to enhance the nonlinear representation and mitigate the vanishing gradient problem.
  • Figure 4: Illustration of Enhanced Blocks (EBs). Three non-linear convolutional blocks in the Invertible Residual Block are compared.
  • Figure 5: Visual results of upscaling the $4\times$ downscaled images. The images on the right are $128 \times 128$ patches cropped from the images on the left. IRRM recovers rich textures and realistic details, leading to better recovery performance, with PSNR gains of 4 dB over RCAN and 0.6 dB over IRN.
  • ...and 3 more figures
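
The Figure 3 caption mentions that IRBs are stacked after a wavelet transformation. As a rough illustration of how such a transform separates low-frequency from high-frequency content while staying invertible, here is a sketch of a one-level 2D Haar transform. The choice of the Haar wavelet is an assumption (the text above only says "wavelet transformation"), and the function names `haar_forward` and `haar_inverse` are hypothetical.

```python
import numpy as np

def haar_forward(img):
    """Split an (H, W) image with even H and W into four (H/2, W/2) sub-bands."""
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency (scaled local average)
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def haar_inverse(ll, lh, hl, hh):
    """Exactly reconstruct the (H, W) image from its four sub-bands."""
    h, w = ll.shape
    img = np.empty((2 * h, 2 * w), dtype=float)
    img[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    img[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    img[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    img[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return img

# A random 8x8 "image": the transform halves the resolution and is lossless.
image = np.random.default_rng(0).standard_normal((8, 8))
bands = haar_forward(image)
restored = haar_inverse(*bands)

# A flat image has no high-frequency content: all detail bands are zero.
flat_bands = haar_forward(np.ones((4, 4)))
```

The low-frequency band `ll` is a natural candidate for the downscaled image, while the three detail bands carry the high-frequency information that the invertible blocks then have to model.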