RDDM: Practicing RAW Domain Diffusion Model for Real-world Image Restoration
Yan Chen, Yi Wen, Wei Li, Junchao Liu, Yong Guo, Jie Hu, Xinghao Chen
TL;DR
This paper introduces RDDM, a RAW-domain diffusion model for real-world image restoration that restores high-quality (HQ) images directly from sensor RAW data, avoiding the irreversible information loss introduced by the ISP. It combines a RAW-domain VAE (RVAE), a Configurable Multi-Bayer LoRA (CMB-LoRA) module, and a RAW data synthesis pipeline that enables large-scale RAW training, with restoration guided by semantic prompts from DAPE. Empirical results show state-of-the-art fidelity and competitive generative quality on real and synthetic RAW benchmarks, with better cross-sensor generalization and efficiency than sRGB-based methods and other diffusion baselines. The approach is relevant to edge devices and real-world imaging, and it supports the viability of diffusion models that operate directly in the RAW domain.
Abstract
We present the RAW domain diffusion model (RDDM), an end-to-end diffusion model that restores photo-realistic images directly from sensor RAW data. While recent sRGB-domain diffusion methods achieve impressive results, they are caught in a dilemma between high fidelity and image generation. These models process lossy sRGB inputs and neglect the fact that sensor RAW images are accessible in many scenarios, e.g., image and video capture on edge devices, which results in sub-optimal performance. RDDM removes this limitation by restoring images directly in the RAW domain, replacing the conventional two-stage pipeline of image signal processing (ISP) followed by image restoration (IR). However, a simple adaptation of pre-trained diffusion models to the RAW domain faces several challenges. To this end, we propose: (1) a RAW-domain VAE (RVAE), which encodes sensor RAW data and decodes it into an enhanced linear-domain image, to address the out-of-distribution (OOD) gap between the two domains; (2) a configurable multi-Bayer (CMB) LoRA module, which adapts to diverse RAW Bayer patterns such as RGGB and BGGR. To compensate for the scarcity of RAW training data, we develop a scalable data synthesis pipeline that synthesizes RAW LQ-HQ pairs from existing sRGB datasets for large-scale training. Extensive experiments demonstrate RDDM's superiority over state-of-the-art sRGB diffusion methods, yielding higher-fidelity results with fewer artifacts. Code will be publicly available at https://github.com/YanCHEN-fr/RDDM.
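To make the Bayer patterns and the idea of synthesizing RAW pairs from sRGB images concrete, the following is a minimal sketch and not the authors' pipeline, whose details are not given in this abstract. It assumes inputs in [0, 1], uses the standard sRGB transfer function, and treats the function names (`srgb_to_linear`, `mosaic`, `add_noise`) and the noise parameters (`shot`, `read`) as hypothetical placeholders chosen for illustration.

```python
import numpy as np

# Channel index (0=R, 1=G, 2=B) at each position of the 2x2 Bayer tile,
# listed row-major: top-left, top-right, bottom-left, bottom-right.
BAYER_PATTERNS = {
    "RGGB": (0, 1, 1, 2),
    "BGGR": (2, 1, 1, 0),
    "GRBG": (1, 0, 2, 1),
    "GBRG": (1, 2, 0, 1),
}


def srgb_to_linear(srgb):
    """Invert the standard sRGB transfer function on values in [0, 1]."""
    srgb = np.asarray(srgb, dtype=np.float64)
    return np.where(srgb <= 0.04045,
                    srgb / 12.92,
                    ((srgb + 0.055) / 1.055) ** 2.4)


def mosaic(linear_rgb, pattern="RGGB"):
    """Sample an (H, W, 3) linear image into a single-channel (H, W) Bayer mosaic."""
    h, w, _ = linear_rgb.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    c = BAYER_PATTERNS[pattern]
    raw = np.empty((h, w), dtype=linear_rgb.dtype)
    raw[0::2, 0::2] = linear_rgb[0::2, 0::2, c[0]]  # top-left of each tile
    raw[0::2, 1::2] = linear_rgb[0::2, 1::2, c[1]]  # top-right
    raw[1::2, 0::2] = linear_rgb[1::2, 0::2, c[2]]  # bottom-left
    raw[1::2, 1::2] = linear_rgb[1::2, 1::2, c[3]]  # bottom-right
    return raw


def add_noise(raw, shot=0.01, read=0.002, seed=0):
    """Toy signal-dependent Gaussian noise (an assumption, not the paper's model).

    Per-pixel variance is shot * signal + read**2.
    """
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(shot * np.clip(raw, 0.0, None) + read ** 2)
    return np.clip(raw + rng.normal(size=raw.shape) * sigma, 0.0, 1.0)


# Build one illustrative HQ/LQ RAW pair from a random stand-in "sRGB" image.
srgb = np.random.default_rng(1).random((4, 4, 3))
hq_raw = mosaic(srgb_to_linear(srgb), "BGGR")
lq_raw = add_noise(hq_raw)
```

Changing the `pattern` argument only changes which color channel is sampled at each position of the 2x2 tile, which is the kind of sensor-dependent variation a multi-Bayer module has to handle. A realistic synthesis pipeline would also need to invert the other ISP stages (white balance, color correction, tone mapping) and model sensor-specific degradations, which this sketch omits.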
