ISPDiffuser: Learning RAW-to-sRGB Mappings with Texture-Aware Diffusion Models and Histogram-Guided Color Consistency
Yang Ren, Hai Jiang, Menglong Yang, Wei Li, Shuaicheng Liu
TL;DR
ISPDiffuser addresses RAW-to-sRGB mapping by decoupling detail reconstruction from color mapping. A texture-aware diffusion model reconstructs details in grayscale space, and a histogram-guided color consistency module maps the grayscale result to sRGB with accurate, DSLR-like colors. The framework is trained in two stages with dedicated losses $L_{con}$, $L_{tel}$, and $L_{ccl}$. On the ZRR and MAI benchmarks it reports state-of-the-art perceptual and quantitative metrics, and user studies corroborate its visual quality. The approach offers a practical route to DSLR-quality sRGB output from mobile RAW data, and could improve ISP pipelines when paired with efficient inference strategies.
Abstract
RAW-to-sRGB mapping, or the simulation of the traditional camera image signal processor (ISP), aims to generate DSLR-quality sRGB images from raw data captured by smartphone sensors. Although existing learning-based methods achieve results comparable to sophisticated handcrafted camera ISP solutions, they still struggle with detail disparity and color distortion. In this paper, we present ISPDiffuser, a diffusion-based decoupled framework that separates the RAW-to-sRGB mapping into detail reconstruction in grayscale space and color consistency mapping from grayscale to sRGB. Specifically, we propose a texture-aware diffusion model that leverages the generative ability of diffusion models to focus on local detail recovery, and we further propose a texture enrichment loss that prompts the diffusion model to generate more intricate texture details. Subsequently, we introduce a histogram-guided color consistency module that uses the color histogram as guidance to learn precise color information for the grayscale-to-sRGB mapping, with a color consistency loss designed to constrain the learned color information. Extensive experimental results show that the proposed ISPDiffuser outperforms state-of-the-art competitors both quantitatively and visually. The code is available at https://github.com/RenYangSCU/ISPDiffuser.
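To make the decoupling concrete, the sketch below computes the two quantities the abstract separates: a grayscale image, which carries the detail, and a per-channel color histogram, which carries the color statistics. It is an illustration only, not the authors' implementation. The function names, the BT.601 luma weights, and the bin count are assumptions introduced here, and in ISPDiffuser the color information is learned by a network rather than read off a reference image.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # (R, G, B), each value expected in [0, 1]
Image = List[List[Pixel]]


def to_grayscale(image: Image) -> List[List[float]]:
    """Convert an sRGB image to grayscale.

    Assumption: BT.601 luma weights; the paper's conversion may differ.
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in image
    ]


def color_histogram(image: Image, bins: int = 8) -> List[List[float]]:
    """Return a normalized histogram per channel (3 lists of `bins` values).

    Each channel's histogram sums to 1 for a non-empty image.
    """
    hist = [[0.0] * bins for _ in range(3)]
    count = 0
    for row in image:
        for pixel in row:
            count += 1
            for channel, value in enumerate(pixel):
                # Clamp the bin index so that value == 1.0 falls in the last bin.
                index = min(max(int(value * bins), 0), bins - 1)
                hist[channel][index] += 1.0
    if count == 0:
        return hist
    return [[h / count for h in channel_hist] for channel_hist in hist]


# A hypothetical 2x2 image: red, green, blue, and white pixels.
example: Image = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],
]
gray = to_grayscale(example)
hist = color_histogram(example, bins=4)
```

The grayscale image discards chroma but keeps spatial structure, while the histogram discards spatial structure but keeps the color distribution. That complementarity is why the two can be handled by separate components.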
