Raw-JPEG Adapter: Efficient Raw Image Compression with JPEG

Mahmoud Afifi, Ran Zhang, Michael S. Brown

TL;DR

The paper addresses the challenge of storing raw sensor data efficiently by introducing Raw-JPEG Adapter, a lightweight, invertible pre-processing pipeline that makes raw data compatible with standard JPEG while preserving reconstruction fidelity. A compact network predicts parameters $\theta$ for three invertible operations: a channel-wise gamma map $\boldsymbol{\Gamma}$, 1D RGB LUTs, and an optional global 8×8 DCT scaling $\mathbf{S}$. These parameters are stored in the JPEG COM segment, so the decoder can apply the inverse operations and reconstruct the raw image: $\hat{\mathbf{I}} = F^{-1}(\mathrm{Dec}(\mathrm{Enc}(F(\mathbf{I};\theta)));\theta)$. Because JPEG encoding is lossy, $\hat{\mathbf{I}}$ approximates $\mathbf{I}$ rather than matching it exactly. Trained in a self-supervised fashion through a differentiable JPEG simulator, the method achieves higher fidelity than direct JPEG storage and several raw-reconstruction baselines across multiple datasets (S24, S7, MIT-Adobe FiveK, NUS) and remains compatible with other codecs (e.g., JPEG 2000, LIC-TCM). The approach delivers substantial storage savings while preserving post-capture editability: files are substantially smaller (often under a few megabytes) and decoding overhead is minimal. The work also introduces metrics such as wBPP to account for tonal diversity, and reports strong cross-camera generalization and support for post-capture re-rendering without large metadata overhead.
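The round trip $\hat{\mathbf{I}} = F^{-1}(\mathrm{Dec}(\mathrm{Enc}(F(\mathbf{I};\theta)));\theta)$ can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the gamma is a single scalar rather than a predicted map, the LUT is a hand-picked monotonic curve rather than a learned one, the DCT scaling is omitted, and `quantize8` is a hypothetical stand-in that only rounds to 8 bits instead of running a real JPEG codec.

```python
import bisect

def make_lut(n=16, power=0.5):
    """Hypothetical monotonic 1D LUT: n knots sampling x**power on [0, 1]."""
    return [(i / (n - 1)) ** power for i in range(n)]

def interp(x, xs, ys):
    """Piecewise-linear interpolation through strictly increasing knots."""
    i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

def forward(pixels, gamma, lut):
    """F(I; theta): gamma followed by the LUT (assumed order)."""
    grid = [i / (len(lut) - 1) for i in range(len(lut))]
    return [interp(p ** gamma, grid, lut) for p in pixels]

def inverse(pixels, gamma, lut):
    """F^{-1}: invert the LUT by swapping its axes, then undo the gamma."""
    grid = [i / (len(lut) - 1) for i in range(len(lut))]
    return [interp(p, lut, grid) ** (1.0 / gamma) for p in pixels]

def quantize8(pixels):
    """Stand-in for Dec(Enc(.)): 8-bit rounding only, NOT real JPEG."""
    return [round(p * 255) / 255 for p in pixels]

raw = [0.001, 0.01, 0.05, 0.2, 0.8]   # linear raw values in [0, 1]
gamma, lut = 0.5, make_lut()

direct = quantize8(raw)                                        # no adapter
adapted = inverse(quantize8(forward(raw, gamma, lut)), gamma, lut)

for r, d, a in zip(raw, direct, adapted):
    print(f"raw={r:.3f}  direct_err={abs(r - d):.6f}  adapter_err={abs(r - a):.6f}")
```

In this toy setting the transform spends more of the 8-bit range on dark values, which is where linear raw data tends to concentrate; bright values can come out slightly worse, which is one reason the parameters are worth predicting per image rather than fixing by hand.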

Abstract

Digital cameras digitize scene light into linear raw representations, which the image signal processor (ISP) converts into display-ready outputs. While raw data preserves full sensor information, which is valuable for editing and vision tasks, formats such as Digital Negative (DNG) require large storage, making them impractical in constrained scenarios. In contrast, JPEG is a widely supported format, offering high compression efficiency and broad compatibility, but it is not well-suited for raw storage. This paper presents Raw-JPEG Adapter, a lightweight, learnable, and invertible preprocessing pipeline that adapts raw images for standard JPEG compression. Our method applies spatial and optional frequency-domain transforms, with compact parameters stored in the JPEG comment field, enabling accurate raw reconstruction. Experiments across multiple datasets show that our method achieves higher fidelity than direct JPEG storage, supports other codecs, and provides a favorable trade-off between compression ratio and reconstruction accuracy.

Paper Structure

This paper contains 26 sections, 21 equations, 18 figures, 12 tables.

Figures (18)

  • Figure 1: We present Raw-JPEG Adapter, a lightweight, learnable pre-processing pipeline that adapts raw images before standard JPEG compression using spatial and optionally frequency domain transforms. The operations are fully invertible, with parameters fitting in the JPEG comment field ($<$64 KB), enabling accurate raw reconstruction after JPEG decoding and significantly reducing file size. In this figure, (A) shows the original raw (DNG), stored as JPEG with high compression (quality 25) without our method in (B), and with our method in (C). Error maps for (B) and (C) are shown on the right. All raw images shown in this paper are tone-mapped for visualization.
  • Figure 2: Raw-JPEG Adapter achieves a favorable trade-off between bits per pixel (BPP), raw image fidelity, and runtime overhead compared to storing raw images with JPEG or using state-of-the-art raw reconstruction methods (INF and R2LCM). Shown are normalized metrics (PSNR, SSIM, MS-SSIM, inverted BPP, and inverted runtime overhead) on the S24 dataset, with JPEG quality 95 for both the baseline JPEG and our method.
  • Figure 3: Our Raw-JPEG Adapter uses a lightweight network ($\sim$37K parameters) to process a thumbnail of the raw image and produce parameters for a pixel-wise gamma operator, RGB 1D tone-mapping lookup tables, and an optional DCT-based component applied globally in the frequency domain over 8$\times$8 blocks. These transformations are applied before saving the image as a JPEG, while the associated parameters are compressed and embedded in the JPEG file’s comment (COM) segment ($<$64 KB). All intermediate feature dimensionalities are annotated in the figure, with channel counts shown in red and spatial dimensions in purple. At decoding time, the parameters are retrieved, inverted, and applied to the stored image to reconstruct the original raw content. During training, the JPEG step is replaced with a differentiable simulator, and the network is optimized in a self-supervised manner.
  • Figure 4: Qualitative example from the S24 test set at JPEG quality 25. (A) Uncompressed demosaiced raw image. (B) JPEG-compressed raw without pre-/post-processing, with its error map. (C) JPEG image produced by our Raw-JPEG Adapter using the predicted RGB LUTs and gamma map. (D) Decoded raw image by our Raw-JPEG Adapter, along with the corresponding error map.
  • Figure 5: On the left and right, we show sRGB images rendered by Adobe Lightroom and LiteISP, respectively, from: (B) the uncompressed raw image in (A) stored as a DNG file, (C) the JPEG-compressed raw image without our method, and (D) the JPEG-compressed raw image with our method. The raw JPEG images used to produce (C) and (D) were saved at quality level 50.
  • ...and 13 more figures
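
The captions for Figures 1 and 3 note that the parameters must fit in the JPEG comment (COM) segment, under 64 KB. That limit comes from the JPEG segment format: a COM segment is the marker `0xFFFE` followed by a 16-bit big-endian length that counts its own two bytes, leaving at most 65,533 bytes of payload. The sketch below shows one way such a segment could be written and read back. The function names `embed_com` and `extract_com` are hypothetical, `zlib` is an assumed choice of compressor (the source says only that the parameters are compressed), and the reader handles only header segments that appear before the start of scan.

```python
import struct
import zlib

SOI = b"\xff\xd8"           # start-of-image marker
COM = b"\xff\xfe"           # comment marker
MAX_PAYLOAD = 65535 - 2     # 16-bit length field includes its own 2 bytes

def embed_com(jpeg: bytes, params: bytes) -> bytes:
    """Insert zlib-compressed params as a COM segment near the file start."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG stream")
    payload = zlib.compress(params)
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("parameters do not fit in one COM segment")
    segment = COM + struct.pack(">H", len(payload) + 2) + payload
    pos = 2
    if jpeg[2:4] == b"\xff\xe0":     # keep a JFIF APP0 header right after SOI
        (length,) = struct.unpack(">H", jpeg[4:6])
        pos = 4 + length
    return jpeg[:pos] + segment + jpeg[pos:]

def extract_com(jpeg: bytes) -> bytes:
    """Walk header segments and return the first COM payload, decompressed."""
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker == 0xFE:
            return zlib.decompress(jpeg[pos + 4:pos + 2 + length])
        if marker == 0xDA:           # start of scan: no more header segments
            break
        pos += 2 + length
    raise ValueError("no COM segment found")
```

Decoders that ignore comment segments still display the stored image unchanged, so a file written this way remains an ordinary JPEG for software that does not know about the embedded parameters.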