SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements

Haiyang Xie, Xi Shen, Shihua Huang, Qirui Wang, Zheng Wang

TL;DR

SimROD introduces a minimal yet effective RAW object detection pipeline that bypasses the ISP using two key ideas: a Global Gamma Enhancement (GGE) with four learnable per-channel parameters, and a Green-Guided Local Enhancement (GGLE) that exploits the green channel’s high-frequency information. The approach is end-to-end trainable and extremely lightweight (≈0.003M additional parameters), achieving state-of-the-art results on RAW benchmarks such as ROD, LOD, and Pascal-Raw, along with strong performance on ADE20K-Raw segmentation. Empirically, GGE provides essential global normalization, GGLE refines local details, and green-channel guidance is consistently more beneficial than guidance from other channels, especially in low-light conditions. The findings highlight RAW data as a practical alternative to RGB pipelines in real-world detectors, offering lower hardware complexity and latency without sacrificing accuracy.

Abstract

Most visual models are designed for sRGB images, yet RAW data offers significant advantages for object detection by preserving sensor information before ISP processing. This enables improved detection accuracy and more efficient hardware designs by bypassing the ISP. However, RAW object detection is challenging due to limited training data, unbalanced pixel distributions, and sensor noise. To address this, we propose SimROD, a lightweight and effective approach for RAW object detection. We introduce a Global Gamma Enhancement (GGE) module, which applies a learnable global gamma transformation with only four parameters, improving feature representation while keeping the model efficient. Additionally, we leverage the green channel's richer signal to enhance local details, aligning with the human eye's sensitivity and Bayer filter design. Extensive experiments on multiple RAW object detection datasets and detectors demonstrate that SimROD outperforms state-of-the-art methods like RAW-Adapter and DIAP while maintaining efficiency. Our work highlights the potential of RAW data for real-world object detection. Code is available at https://ocean146.github.io/SimROD2025/.
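To make the idea of a four-parameter global gamma concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes an RGGB Bayer layout, inputs normalized to [0, 1], one gamma per packed channel, and a sigmoid that keeps each gamma inside a fixed range. The function names, the `theta` parameter, and the default bounds are hypothetical and chosen only for illustration.

```python
import numpy as np

def pack_bayer_rggb(mosaic):
    """Pack an (H, W) Bayer mosaic into a (4, H/2, W/2) array ordered R, G1, G2, B.

    Assumes an RGGB layout and even H and W (an assumption, not from the paper).
    """
    r = mosaic[0::2, 0::2]
    g1 = mosaic[0::2, 1::2]
    g2 = mosaic[1::2, 0::2]
    b = mosaic[1::2, 1::2]
    return np.stack([r, g1, g2, b], axis=0)

def global_gamma(packed, theta, gamma_min=0.1, gamma_max=1.0, eps=1e-6):
    """Apply one gamma per packed channel to data in [0, 1].

    theta holds four unconstrained values, standing in for learnable parameters.
    The sigmoid bounding and the default range are illustrative assumptions.
    """
    theta = np.asarray(theta, dtype=np.float64)
    sigmoid = 1.0 / (1.0 + np.exp(-theta))
    gamma = gamma_min + (gamma_max - gamma_min) * sigmoid
    # Clip away exact zeros so the power function stays well behaved.
    x = np.clip(packed, eps, 1.0)
    return x ** gamma[:, None, None], gamma

# Usage on a tiny synthetic mosaic.
mosaic = np.arange(16, dtype=np.float64).reshape(4, 4) / 15.0
packed = pack_bayer_rggb(mosaic)
enhanced, gamma = global_gamma(packed, theta=np.zeros(4))
```

In a trainable setting, `theta` would be optimized jointly with the detector. With gamma below 1, dark values are lifted, which is one plausible way such a transform could rebalance the skewed pixel distribution of RAW data.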

Paper Structure

This paper contains 39 sections, 5 equations, 7 figures, 11 tables.

Figures (7)

  • Figure 1: Top: Advantages of RAW Data for Object Detection. Using RAW data eliminates the need for an ISP, reducing system complexity, latency, and cost, which is crucial for lightweight, real-time applications (Figure 1a). Bottom: Key Insights in SimROD. The green channel in RAW data carries more detailed information. The percentages indicate the proportion of color pixels with the highest intensity in the RGB channels; higher values mean richer details and lower noise in challenging lighting conditions (Figure 1b).
  • Figure 2: Left: We evaluate RAW object detection on the LOD dataset [hong2021crafting] using individual color channels (green (G), red (R), and blue (B)) with the state-of-the-art DIAP method [xu2023toward]. The results highlight the superior performance of G. Right: G has a significantly higher SNR than R and B, suggesting it may be more resistant to noise in extreme lighting conditions, potentially improving robustness.
  • Figure 3: Overview of the proposed SimROD. SimROD takes a packed RAW image as input and first learns a global gamma transformation through the Global Gamma Enhancement (GGE) module. The transformed data is then processed by the Green-Guided Local Enhancement (GGLE) module to enhance local details.
  • Figure 4: $\gamma$ across epochs on LOD and Pascal-Raw.
  • Figure 5: Sensitivity of mAP to the bounds $\gamma_{min}$ and $\gamma_{max}$ defined in our GGE. The results show minimal performance variation across different gamma ranges, indicating robust detection performance within the tested parameter bounds.
  • ...and 2 more figures
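
The two green-channel statistics mentioned in the captions of Figures 1 and 2 can be approximated with a short NumPy sketch. The exact definitions used in the paper are not given here, so both functions are assumptions: `dominance_share` counts the fraction of pixels whose highest intensity falls in each channel, and `channel_snr_db` uses a crude mean-over-standard-deviation estimate in decibels. Both names are hypothetical.

```python
import numpy as np

def dominance_share(rgb):
    """Fraction of pixels in an (H, W, 3) array whose largest value is in R, G, or B.

    Ties are assigned to the lowest channel index (NumPy argmax behaviour).
    """
    winners = np.argmax(rgb, axis=-1)
    counts = np.bincount(winners.ravel(), minlength=3)
    return counts / winners.size

def channel_snr_db(rgb, eps=1e-12):
    """Crude per-channel SNR estimate: 20 * log10(mean / std).

    This is an illustrative definition, not necessarily the one used in the paper.
    """
    flat = rgb.reshape(-1, 3).astype(np.float64)
    mean = np.maximum(flat.mean(axis=0), eps)
    std = flat.std(axis=0) + eps
    return 20.0 * np.log10(mean / std)

# Usage on a 2x2 synthetic image: same mean per channel, different spread.
img = np.empty((2, 2, 3))
img[..., 0] = [[0.4, 0.6], [0.4, 0.6]]      # R: moderate spread
img[..., 1] = [[0.49, 0.51], [0.49, 0.51]]  # G: small spread
img[..., 2] = [[0.3, 0.7], [0.3, 0.7]]      # B: large spread
share = dominance_share(img)
snr = channel_snr_db(img)
```

On real data the spread within a channel mixes scene content with noise, so a flat calibration patch or repeated captures would give a more meaningful SNR estimate than a whole-image statistic.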