A Generative Model for Digital Camera Noise Synthesis

Mingyang Song, Yang Zhang, Tunç O. Aydın, Elham Amin Mansour, Christopher Schroers

TL;DR

This work tackles realistic camera noise synthesis conditioned on camera settings, a challenging problem with applications in data augmentation and denoising. It introduces CFG-NIN, a conditional GAN-based generator built on a Simple NAF (SNAF) architecture. The generator seeds stochastic noise by concatenating a Gaussian noise map at the transition point between encoder and decoder, and injects further noise at multiple decoder stages, all guided by a control map (CM). A Style Loss based on Gram features from a pre-trained network supervises distributional realism, stabilizing GAN training while preserving temporal variance and spatial correlations. Empirical results in both sRGB and rawRGB spaces show closer distribution matching than existing methods and favorable denoising transfer, indicating practical impact for denoising, video codecs, and synthetic noise generation.
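
To make the Gram-feature idea concrete, here is a minimal NumPy sketch of a style loss. It is not the authors' implementation: the feature extractor, the choice of layers, the normalization by C·H·W, and the names `gram_matrix` and `style_loss` are all assumptions made for illustration, and random arrays stand in for activations of a pre-trained network.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (C, H, W) feature map, normalized by C * H * W (assumed)."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)

def style_loss(real_feats, fake_feats):
    """Sum over layers of the mean squared difference between Gram matrices."""
    return sum(
        float(np.mean((gram_matrix(r) - gram_matrix(f)) ** 2))
        for r, f in zip(real_feats, fake_feats)
    )

# Stand-ins for feature maps of real and synthetic noisy images (hypothetical shapes).
rng = np.random.default_rng(0)
real = [rng.standard_normal((8, 16, 16)), rng.standard_normal((16, 16, 16))]
fake = [rng.standard_normal((8, 16, 16)), rng.standard_normal((16, 16, 16))]
loss = style_loss(real, fake)
```

Because a Gram matrix sums over spatial positions, it compares channel co-activation statistics rather than pixel-aligned values, which suits noise: two samples of the same noise distribution differ pixel by pixel but should have similar statistics.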

Abstract

Noise synthesis is a challenging low-level vision task aiming to generate realistic noise given a clean image along with the camera settings. To this end, we propose an effective generative model which utilizes clean features as guidance, followed by noise injections into the network. Specifically, our generator follows a UNet-like structure with skip connections but without downsampling and upsampling layers. Firstly, we extract deep features from a clean image as the guidance and concatenate a Gaussian noise map to the transition point between the encoder and decoder as the noise source. Secondly, we propose noise synthesis blocks in the decoder, in each of which we inject Gaussian noise to model the noise characteristics. Thirdly, we propose to utilize an additional Style Loss and demonstrate that this allows better supervision of noise characteristics in the generator. Through a number of new experiments, we evaluate the temporal variance and the spatial correlation of the generated noise, which we hope can provide meaningful insights for future work. Finally, we show that our proposed approach outperforms existing methods for synthesizing camera noise.
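
The two noise sources described in the abstract can be sketched in NumPy as follows. This is a hedged illustration only: the convolutional blocks are omitted, the helper names `concat_noise_map` and `inject_noise` are hypothetical, and the per-channel scale used for injection is an assumption, since the abstract does not say how the injected noise is weighted.

```python
import numpy as np

def concat_noise_map(features, rng):
    """Append one standard-normal channel to a (C, H, W) feature map.

    Mirrors concatenating a Gaussian noise map at the transition point
    between encoder and decoder.
    """
    _, h, w = features.shape
    noise = rng.standard_normal((1, h, w)).astype(features.dtype)
    return np.concatenate([features, noise], axis=0)

def inject_noise(features, scale, rng):
    """Add Gaussian noise to a (C, H, W) feature map, scaled per channel (assumed)."""
    noise = rng.standard_normal(features.shape).astype(features.dtype)
    return features + scale[:, None, None] * noise

# Full-resolution features: no downsampling, so H and W match the input image.
rng = np.random.default_rng(0)
clean_features = np.ones((4, 8, 8), dtype=np.float32)
seeded = concat_noise_map(clean_features, rng)          # shape (5, 8, 8)
scale = np.full(5, 0.1, dtype=np.float32)               # would be learned in practice
decoded = inject_noise(seeded, scale, rng)              # one decoder-stage injection
```

Running the same clean features through these steps with a different random state gives a different output, which is what allows the temporal variance of the generated noise to be measured.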

Paper Structure (27 sections, 14 equations, 19 figures, 11 tables)

This paper contains 27 sections, 14 equations, 19 figures, 11 tables.

Figures (19)

  • Figure 1: Comparison of real and deep-learning-based synthetic noisy images.
  • Figure 2: The architecture of our generator and discriminator.
  • Figure 3: Components of SNAF and SNAF-NI block.
  • Figure 4: Visual comparison with the baseline methods on noise modeling task in sRGB space. Zoom in for better visualization.
  • Figure 5: Clean image's intensity vs. variance of noise in sRGB space. All the sub-figures share the same legends as the first row for simplicity. The columns demonstrate the results on R, G and B channels separately.
  • ...and 14 more figures
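
An intensity-versus-variance curve like the one described in the Figure 5 caption could be computed along these lines. This is a sketch under assumptions, not the paper's evaluation code: the function name `variance_per_intensity`, the bin count, the [0, 1] intensity range, and the synthetic signal-dependent noise in the usage lines are all illustrative choices. It handles one channel at a time, so it would be applied to R, G and B separately.

```python
import numpy as np

def variance_per_intensity(clean, noisy, num_bins=16):
    """Bin pixels by clean intensity in [0, 1] and return the noise variance per bin."""
    residual = (noisy - clean).ravel()
    bins = np.clip((clean.ravel() * num_bins).astype(int), 0, num_bins - 1)
    variances = np.full(num_bins, np.nan)  # NaN marks bins with too few pixels
    for b in range(num_bins):
        values = residual[bins == b]
        if values.size > 1:
            variances[b] = values.var()
    return variances

# Synthetic single-channel example with variance growing linearly in intensity.
rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(256, 256))
sigma = np.sqrt(1e-4 + 1e-2 * clean)
noisy = clean + sigma * rng.standard_normal(clean.shape)
curve = variance_per_intensity(clean, noisy)
```

Plotting `curve` for real and synthetic noisy images on the same axes would show how closely a generator reproduces the signal dependence of the noise.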