RDDM: Practicing RAW Domain Diffusion Model for Real-world Image Restoration

Yan Chen, Yi Wen, Wei Li, Junchao Liu, Yong Guo, Jie Hu, Xinghao Chen

TL;DR

This paper introduces RDDM, a RAW-domain diffusion model for real-world image restoration that restores high-quality (HQ) images directly from sensor RAW data, avoiding the irreversible information loss introduced by the ISP. It combines a RAW-domain VAE (RVAE), a Configurable Multi-Bayer LoRA (CMB-LoRA), and a RAW data synthesis pipeline that enables large-scale RAW training, with restoration guided by semantic prompts from DAPE. Empirical results show state-of-the-art fidelity and competitive generative quality on real and synthetic RAW benchmarks, with better cross-sensor generalization and efficiency than sRGB-based methods and other diffusion baselines. The approach is relevant to edge devices and real-world imaging, and it demonstrates that diffusion models can operate directly in the RAW domain.

Abstract

We present the RAW domain diffusion model (RDDM), an end-to-end diffusion model that restores photo-realistic images directly from sensor RAW data. While recent sRGB-domain diffusion methods achieve impressive results, they face a trade-off between fidelity and generative quality. These models process lossy sRGB inputs and overlook the fact that sensor RAW images are accessible in many scenarios, e.g., image and video capture on edge devices, which results in sub-optimal performance. RDDM removes this limitation by restoring images directly in the RAW domain, replacing the conventional two-stage pipeline of image signal processing (ISP) followed by image restoration (IR). However, naively adapting pre-trained diffusion models to the RAW domain faces several challenges. To this end, we propose: (1) a RAW-domain VAE (RVAE), which encodes sensor RAW data and decodes it into an enhanced linear-domain image, to address the out-of-distribution (OOD) gap between the two domains; (2) a configurable multi-Bayer (CMB) LoRA module that adapts to diverse RAW Bayer patterns such as RGGB and BGGR. To compensate for the lack of paired RAW training data, we develop a scalable data synthesis pipeline that synthesizes RAW LQ-HQ pairs from existing sRGB datasets for large-scale training. Extensive experiments demonstrate RDDM's superiority over state-of-the-art sRGB diffusion methods, yielding higher-fidelity results with fewer artifacts. Code will be publicly available at https://github.com/YanCHEN-fr/RDDM.
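To make the Bayer patterns and the sRGB-to-RAW synthesis idea concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the standard sRGB transfer function for linearization, a plain 2x2 colour filter array for mosaicing, and a simple signal-dependent Gaussian noise model as a stand-in degradation. The function names (`srgb_to_linear`, `mosaic`, `synthesize_lq_raw`) and the noise parameters are hypothetical and do not come from the paper.

```python
import numpy as np

# Channel index (0=R, 1=G, 2=B) sampled at each position of the 2x2 Bayer tile,
# in row-major order: (0,0), (0,1), (1,0), (1,1).
BAYER_PATTERNS = {
    "RGGB": (0, 1, 1, 2),
    "BGGR": (2, 1, 1, 0),
    "GRBG": (1, 0, 2, 1),
    "GBRG": (1, 2, 0, 1),
}
TILE_OFFSETS = ((0, 0), (0, 1), (1, 0), (1, 1))


def srgb_to_linear(srgb):
    """Invert the standard sRGB transfer function (inputs in [0, 1])."""
    srgb = np.asarray(srgb, dtype=np.float64)
    return np.where(srgb <= 0.04045, srgb / 12.92, ((srgb + 0.055) / 1.055) ** 2.4)


def mosaic(linear_rgb, pattern="RGGB"):
    """Sample an H x W x 3 linear image into a single-channel H x W Bayer mosaic."""
    h, w, _ = linear_rgb.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    raw = np.empty((h, w), dtype=linear_rgb.dtype)
    for (dy, dx), channel in zip(TILE_OFFSETS, BAYER_PATTERNS[pattern]):
        raw[dy::2, dx::2] = linear_rgb[dy::2, dx::2, channel]
    return raw


def synthesize_lq_raw(srgb, pattern="RGGB", read_noise=0.01, shot_noise=0.01, seed=0):
    """Hypothetical degradation: linearize, mosaic, then add signal-dependent noise."""
    rng = np.random.default_rng(seed)
    raw = mosaic(srgb_to_linear(srgb), pattern)
    sigma = np.sqrt(read_noise ** 2 + shot_noise * raw)  # assumed noise model
    return np.clip(raw + rng.normal(size=raw.shape) * sigma, 0.0, 1.0)
```

For a pure red input, `mosaic(img, "RGGB")` is non-zero only at even rows and even columns, while `mosaic(img, "BGGR")` is non-zero only at odd rows and odd columns. This difference in spatial layout is what a module handling multiple Bayer patterns has to account for.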

Paper Structure

This paper contains 16 sections, 13 equations, 20 figures, 6 tables.

Figures (20)

  • Figure 1: RDDM, restoring directly from the sensor RAW data, demonstrates remarkable results shown in (a), capitalizing on the unprocessed and detail-rich signal. Compared with the two-stage baseline in (b), RDDM delivers markedly higher fidelity and perceptual quality.
  • Figure 2: Distribution gap between RAW and sRGB images.
  • Figure 3: Performance comparison among SD-based methods on the DIV2K-Val and RealSR test datasets.
  • Figure 4: Comparison of Real-IR paradigms. (a) The prevailing paradigm: restoration is performed in the sRGB space after the RAW-to-sRGB mapping via an ISP. (b) RDDM paradigm: we perform restoration directly on the RAW data and subsequently map the restored results to the sRGB space using a PTP module.
  • Figure 5: (a) Illustration of RVAE training strategy. (b) RDDM framework. Specifically, it synchronizes the optimization of the CMB-LoRA (within the RVAE encoder) and the pre-trained diffusion network, supervised by RAW-linear image pairs.
  • ...and 15 more figures