Realistic Noise Synthesis with Diffusion Models

Qi Wu, Mingyan Han, Ting Jiang, Chengzhi Jiang, Jinting Luo, Man Jiang, Haoqiang Fan, Shuaicheng Liu

TL;DR

Real-world RGB noise is irregular and tied to ISP processing and sensor factors, complicating denoising data collection. RNSD leverages diffusion models conditioned on clean content and camera settings, introducing TCCAM for time-aware affine modulation, MCAM for multi-scale content guidance, and DIPS for accelerated sampling, with forward dynamics $q(x_t|x_{t-1}) = \mathcal{N}(x_t; \sqrt{1-\beta_t} x_{t-1}, \beta_t I)$. The method yields higher realism (lower AKLD and PGap) and improves denoising PSNR/SSIM when used for augmentation, while DIPS reduces sampling from 1000 to 5 steps with minimal accuracy loss. Overall, RNSD provides a scalable, high-fidelity noise synthesis pipeline that enhances denoising performance across diverse camera sensors and ISP configurations.
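The forward dynamics quoted above can be illustrated with a short sketch. It is a minimal illustration in plain Python and is not taken from the RNSD code: the linear $\beta_t$ schedule (the common DDPM default, $10^{-4}$ to $0.02$ over 1000 steps) and the helper names `linear_beta_schedule`, `forward_step`, and `cumulative_signal_scale` are assumptions made for this example.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Assumed linear schedule (a common DDPM default); the paper's may differ."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_step(x_prev, beta_t, rng):
    """Draw x_t ~ N(sqrt(1 - beta_t) * x_{t-1}, beta_t * I), element by element."""
    scale = math.sqrt(1.0 - beta_t)
    sigma = math.sqrt(beta_t)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x_prev]


def cumulative_signal_scale(betas):
    """For each t, return sqrt(prod_{s<=t} (1 - beta_s)), the weight left on x_0."""
    scales, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        scales.append(math.sqrt(prod))
    return scales


if __name__ == "__main__":
    rng = random.Random(0)
    betas = linear_beta_schedule()
    x = [0.5] * 8  # toy "image" of eight pixel values
    for beta_t in betas:
        x = forward_step(x, beta_t, rng)
    # After all steps the weight on the starting signal is close to zero.
    print(cumulative_signal_scale(betas)[-1])
```

Under this assumed schedule the weight on the starting signal shrinks to nearly zero by the final step, which is why unaccelerated sampling has to walk back through many steps and why a shortened sampling sequence is attractive.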

Abstract

Deep denoising models require extensive real-world training data, which is challenging to acquire. Current noise synthesis techniques struggle to accurately model complex noise distributions. We propose a novel Realistic Noise Synthesis Diffusor (RNSD) method using diffusion models to address these challenges. By encoding camera settings into a time-aware camera-conditioned affine modulation (TCCAM), RNSD generates more realistic noise distributions under various camera conditions. Additionally, RNSD integrates a multi-scale content-aware module (MCAM), enabling the generation of structured noise with spatial correlations across multiple frequencies. We also introduce Deep Image Prior Sampling (DIPS), a learnable sampling sequence based on deep image prior, which significantly accelerates the sampling process while maintaining the high quality of synthesized noise. Extensive experiments demonstrate that our RNSD method significantly outperforms existing techniques in synthesizing realistic noise under multiple metrics and improving image denoising performance.
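To make the idea of a time-aware, camera-conditioned affine modulation more concrete, the sketch below shows generic scale-and-shift conditioning in plain Python. It is not the authors' TCCAM. The sinusoidal time embedding, the single linear projection, the `(1 + scale) * x + shift` form, and all function and parameter names are assumptions chosen for illustration.

```python
import math


def sinusoidal_embedding(t, dim):
    """Standard sinusoidal embedding of a scalar timestep (assumed choice)."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


def linear(x, weight, bias):
    """Dense layer: weight is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def affine_modulate(features, cond, w_scale, b_scale, w_shift, b_shift):
    """Scale and shift each feature channel using a conditioning vector.

    features: list of channels, each a flat list of activations.
    cond: hypothetical conditioning vector, e.g. encoded camera settings
          concatenated with the time embedding.
    """
    scale = linear(cond, w_scale, b_scale)  # one scale per channel
    shift = linear(cond, w_shift, b_shift)  # one shift per channel
    return [
        [(1.0 + s) * v + h for v in channel]
        for channel, s, h in zip(features, scale, shift)
    ]


if __name__ == "__main__":
    camera = [0.8, 0.1]  # hypothetical normalized camera settings
    cond = camera + sinusoidal_embedding(t=10, dim=4)
    zeros = [[0.0] * len(cond)]
    features = [[1.0, 2.0]]  # a single channel with two activations
    # With zero weights, the biases alone set the scale and shift.
    print(affine_modulate(features, cond, zeros, [1.0], zeros, [0.5]))
```

Because the conditioning vector includes the timestep embedding, the same camera settings can produce a different scale and shift at each diffusion step, which is one plausible reading of "time-aware" in the abstract.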

Paper Structure (14 sections, 9 equations, 8 figures, 7 tables, 2 algorithms)

Figures (8)

  • Figure 1: Subjective results and AKLD [yue2020danet] of various noise synthesis methods, including sRGB2Flow [kousha2022modeling], DANet [yue2020danet], and C2N [jang2021c2n].
  • Figure 2: (a) The pipeline of noise generation via diffusion. (b) The pipeline of our TCCAM. (c) The architecture of our UNet with MCAM.
  • Figure 3: Illustration of the motivation behind DIPS. The upper panel shows how AKLD varies with the step count in DDPM. The lower panel shows images generated at steps 0, 100, 500, and 800.
  • Figure 4: Noise synthesis samples from different methods, including C2N, sRGB2Flow, DANet, and RNSD.
  • Figure 5: Noise synthesis results with camera settings.
  • ...and 3 more figures