NM-FlowGAN: Modeling sRGB Noise without Paired Images using a Hybrid Approach of Normalizing Flows and GAN

Young Joo Han, Ha-Jin Yu

TL;DR

The proposed NM-FlowGAN is a hybrid approach that exploits the strengths of both GAN and Normalizing Flows, and synthesizes noise using clean images and factors that affect noise characteristics, making it applicable to various fields where obtaining noisy-clean image pairs is not feasible.

Abstract

Modeling and synthesizing real sRGB noise is crucial for various low-level vision tasks, such as building datasets for training image denoising systems. The distribution of real sRGB noise is highly complex and affected by a multitude of factors, making its accurate modeling extremely challenging. Therefore, recent studies have proposed methods that employ data-driven generative models, such as Generative Adversarial Networks (GAN) and Normalizing Flows. These studies achieve more accurate modeling of sRGB noise compared to traditional noise modeling methods. However, there are performance limitations due to the inherent characteristics of each generative model. To address this issue, we propose NM-FlowGAN, a hybrid approach that exploits the strengths of both GAN and Normalizing Flows. We combine pixel-wise noise modeling networks based on Normalizing Flows and spatial correlation modeling networks based on GAN. Specifically, the pixel-wise noise modeling network leverages the high training stability of Normalizing Flows to capture noise characteristics that are affected by a multitude of factors, and the spatial correlation networks efficiently model pixel-to-pixel relationships. In particular, unlike recent methods that rely on paired noisy images, our method synthesizes noise using clean images and factors that affect noise characteristics, such as easily obtainable parameters like camera type and ISO settings, making it applicable to various fields where obtaining noisy-clean image pairs is not feasible. In our experiments, our NM-FlowGAN outperforms other baselines in the sRGB noise synthesis task. Moreover, the denoising neural network trained with synthesized image pairs from our model shows superior performance compared to other baselines. Our code is available at: https://github.com/YoungJooHan/NM-FlowGAN
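
To make the idea of a conditional, pixel-wise Normalizing Flow more concrete, the following is a minimal sketch of a single invertible affine layer whose scale depends on the clean image and a gain value. It is not the architecture from the paper: the function names, the `iso_gain` parameter, and the linear variance model with constants `a` and `b` are all assumptions chosen for illustration. The sketch only shows the change-of-variables mechanics that let a flow both sample noise and evaluate its exact likelihood.

```python
import numpy as np

def log_scale(clean, iso_gain, a=0.02, b=0.001):
    # Hypothetical conditioning: noise variance grows linearly with the
    # clean intensity and with a gain factor standing in for the ISO setting.
    var = iso_gain * (a * clean + b)
    return 0.5 * np.log(var)  # log of the per-pixel standard deviation

def forward(noise, clean, iso_gain):
    """Map noise n to latent z; also return the per-pixel log|dz/dn|."""
    s = log_scale(clean, iso_gain)
    z = noise * np.exp(-s)
    log_det = -s
    return z, log_det

def inverse(z, clean, iso_gain):
    """Map a latent sample z back to noise n (used for synthesis)."""
    s = log_scale(clean, iso_gain)
    return z * np.exp(s)

def log_likelihood(noise, clean, iso_gain):
    """Per-pixel log p(n | clean, iso_gain) via the change of variables."""
    z, log_det = forward(noise, clean, iso_gain)
    log_pz = -0.5 * (z ** 2 + np.log(2.0 * np.pi))  # standard normal base
    return log_pz + log_det

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(8, 8))   # toy clean image in [0, 1]
z = rng.standard_normal(clean.shape)         # latent sample
noise = inverse(z, clean, iso_gain=4.0)      # synthesized noise
z_rec, _ = forward(noise, clean, iso_gain=4.0)
noisy = clean + noise                        # synthetic noisy-clean pair
```

Because the layer is invertible, the same parameters serve two purposes: `inverse` draws noise for a given clean image, and `log_likelihood` scores observed noise, which is the quantity a flow is trained to maximize. A real model would stack many such layers and learn the conditioning rather than fix it by hand.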

Paper Structure

This paper contains 20 sections, 20 equations, 10 figures, and 8 tables.

Figures (10)

  • Figure 1: Two-step pipeline for training real-world image denoising networks with noise modeling networks. (a) First, our method trains noise modeling networks called NM-FlowGAN to synthesize noise $\tilde{n}$ using clean images $x$ and camera parameters $\delta$ and $\gamma$, which represent the camera type and ISO setting. (b) Once trained, NM-FlowGAN synthesizes noise that is added to clean images, generating synthesized noisy-clean image pairs that are used to train image denoising networks.
  • Figure 2: Illustrations of the relationship between the standard deviation of the sRGB noise and clean image intensity under various conditions. These plots demonstrate that sRGB noise exhibits unpredictable and complex distributions. In addition, they show that the noise distribution of the sRGB noise varies depending on conditions such as camera type, ISO, and scene.
  • Figure 3: An illustration of the correlation $r$ versus distance $d$ between neighboring pixels under various camera types and ISO settings. This figure shows that the noise value of a pixel is highly correlated with its neighboring pixels and shows similar behavior across camera types or ISO settings.
  • Figure 4: The overall architecture of our proposed framework for noise synthesis. Our NM-FlowGAN is comprised of two main components: the pixel-wise noise modeling network and the spatial correlation modeling network. These networks are based on Normalizing Flows and GAN, respectively. We employ a dequantization layer at the beginning of the pixel-wise noise modeling network, which adds uniformly sampled noise to the images during training. For better visualization, the magnitudes of the noise images are amplified.
  • Figure 5: The detailed architectures of our conditional linear flow layers: SDL and SAL. These layers are invertible, but we describe the forward pass of conditional linear flows. In the figure, $\odot$ denotes channel-wise concatenation, $\oplus$ and $\otimes$ denote element-wise addition and multiplication, respectively.
  • ...and 5 more figures
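
The correlation-versus-distance measurement described in the Figure 3 caption can be sketched as follows. This is an illustrative assumption about how such a curve might be computed, not the authors' evaluation code: the helper `correlation_vs_distance` is hypothetical, it only looks at horizontal neighbors, and a 3-tap box filter is used purely as a stand-in for whatever process introduces spatial correlation in real sRGB noise.

```python
import numpy as np

def correlation_vs_distance(noise, max_d=3):
    """Pearson correlation r between each pixel and the pixel d columns away."""
    out = []
    for d in range(1, max_d + 1):
        a = noise[:, :-d].ravel()   # every pixel that has a neighbor at +d
        b = noise[:, d:].ravel()    # that neighbor
        out.append(float(np.corrcoef(a, b)[0, 1]))
    return out

rng = np.random.default_rng(0)

# Spatially independent noise: no correlation is expected at any distance.
white = rng.standard_normal((256, 256))

# Averaging three horizontal neighbors makes nearby pixels share samples,
# which yields a correlation that decays with distance.
smoothed = (white[:, :-2] + white[:, 1:-1] + white[:, 2:]) / 3.0

r_white = correlation_vs_distance(white)
r_smooth = correlation_vs_distance(smoothed)
print("white   :", [round(r, 3) for r in r_white])
print("smoothed:", [round(r, 3) for r in r_smooth])
```

For the box-filtered map, neighbors at distance 1 share two of three underlying samples and neighbors at distance 2 share one, so the measured values should land near 2/3 and 1/3 before dropping to roughly zero at distance 3. A purely pixel-wise noise model behaves like the `white` case, which is why a separate component for spatial correlation is useful.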