Enhancing RAW-to-sRGB with Decoupled Style Structure in Fourier Domain

Xuanhua He, Tao Hu, Guoli Wang, Zejin Wang, Run Wang, Qian Zhang, Keyu Yan, Ziyi Chen, Rui Li, Chenjun Xie, Jie Zhang, Man Zhou

TL;DR

This work addresses the challenge of converting mobile RAW images to DSLR-like sRGB output by decoupling image style (color) and spatial structure in the frequency domain. It introduces FourierISP, a Neural ISP with three specialized subnetworks: the Phase Enhance Subnet (PES) for structure, the Amplitude Refine Subnet (ARS) for color, and the Color Adaptation Subnet (CAS) for fusion. These subnetworks use the Fourier phase and amplitude to optimize texture and color separately before integrating them. The approach uses a multi-term loss that combines spatial, frequency-domain, and phase/amplitude objectives. It achieves state-of-the-art results on the ZRR and MAI datasets and transfers well across camera pairs. The method demonstrates that frequency-domain style-structure decoupling is practical for high-quality RAW-to-sRGB mapping and remains robust under misalignment.

Abstract

RAW-to-sRGB mapping, which aims to convert RAW images from smartphones into RGB images equivalent to those of Digital Single-Lens Reflex (DSLR) cameras, has become an important area of research. However, current methods often ignore the difference between cell phone RAW images and DSLR camera RGB images, a difference that goes beyond the color matrix and extends to spatial structure due to resolution variations. Recent methods directly rebuild color mapping and spatial structure via a shared deep representation, which limits performance. Inspired by the Image Signal Processing (ISP) pipeline, which distinguishes image restoration from enhancement, we present a novel Neural ISP framework named FourierISP. This approach breaks the image down into style and structure within the frequency domain, allowing for independent optimization. FourierISP comprises three subnetworks: the Phase Enhance Subnet for structural refinement, the Amplitude Refine Subnet for color learning, and the Color Adaptation Subnet for blending them smoothly. This approach sharpens both color and structure, and extensive evaluations across varied datasets confirm that it achieves state-of-the-art results. Code will be available at https://github.com/alexhe101/FourierISP.
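
The style/structure split described above rests on the standard decomposition of a 2D Fourier spectrum into amplitude and phase. The following is a minimal NumPy sketch of that decomposition, not the FourierISP implementation: the function names `split_amplitude_phase` and `combine_amplitude_phase` are hypothetical, and random arrays stand in for real images. It checks that amplitude and phase together reconstruct the image, and that pairing one image's phase with another image's amplitude yields a result whose per-channel mean follows the amplitude donor.

```python
import numpy as np

def split_amplitude_phase(img):
    """Return the per-channel Fourier amplitude and phase of an H x W x C image."""
    spectrum = np.fft.fft2(img, axes=(0, 1))
    return np.abs(spectrum), np.angle(spectrum)

def combine_amplitude_phase(amplitude, phase):
    """Rebuild a real-valued image from an amplitude and a phase spectrum."""
    spectrum = amplitude * np.exp(1j * phase)
    return np.fft.ifft2(spectrum, axes=(0, 1)).real

# Random stand-ins for two images (assumption: no real data is used here).
rng = np.random.default_rng(0)
structure_img = rng.random((32, 32, 3))
style_img = rng.random((32, 32, 3))

amp_a, pha_a = split_amplitude_phase(structure_img)
amp_b, _ = split_amplitude_phase(style_img)

# Amplitude and phase of the same image reconstruct it exactly (up to float error).
reconstructed = combine_amplitude_phase(amp_a, pha_a)

# Phase from one image, amplitude from the other.
swapped = combine_amplitude_phase(amp_b, pha_a)
```

In this sketch the zero-frequency amplitude carries each channel's mean intensity, which is one simple way to see why global color statistics travel with the amplitude while spatial layout travels with the phase.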

Paper Structure (17 sections, 8 equations, 8 figures, 3 tables)

This paper contains 17 sections, 8 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Results from the ZRR dataset. Our approach results in clear textures, surpassing other methods.
  • Figure 2: Illustration of image amplitude and phase representation. Clear distinction: phase captures spatial structure, while amplitude encodes color information.
  • Figure 3: Our Model Framework. We employ separate processing for RAW images through packing and demosaicing. These processed images are subsequently fed into PES and ARS to learn the spatial details and style information of the image, respectively. Finally, we integrate the style information into the spatial features using the CAS and produce the final output.
  • Figure 4: The left portion of the figure illustrates the Color Adaptation Block, while the right side showcases the Fourier Amplitude Refine Block.
  • Figure 5: Experimental results from the ZRR dataset. Our approach excels in capturing intricate texture details. RAW images are difficult to visualize due to dark local areas.
  • ...and 3 more figures
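
The Figure 3 caption mentions packing RAW images before they enter the network. A common way to pack a single-channel Bayer mosaic is to move each position of every 2x2 tile into its own channel, halving the spatial resolution. The sketch below illustrates that general idea under stated assumptions: the helper name `pack_bayer` is hypothetical, the source does not say which packing FourierISP uses, and which color each output channel holds depends on the sensor's Bayer layout, which is not specified here.

```python
import numpy as np

def pack_bayer(raw):
    """Pack an H x W Bayer mosaic into an (H/2) x (W/2) x 4 array.

    Each output channel holds one of the four positions of every 2x2 tile.
    """
    h, w = raw.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    return np.stack(
        [
            raw[0::2, 0::2],  # top-left of each tile
            raw[0::2, 1::2],  # top-right
            raw[1::2, 0::2],  # bottom-left
            raw[1::2, 1::2],  # bottom-right
        ],
        axis=-1,
    )

# Toy 4x4 mosaic whose values are just their flat indices.
mosaic = np.arange(16, dtype=np.float32).reshape(4, 4)
packed = pack_bayer(mosaic)
```

For the toy mosaic, `packed` has shape `(2, 2, 4)` and its first pixel gathers the values at the four positions of the top-left tile.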