Towards RAW Object Detection in Diverse Conditions

Zhong-Yu Li, Xin Jin, Boyuan Sun, Chun-Le Guo, Ming-Ming Cheng

TL;DR

This work introduces the AODRaw dataset, which offers 7,785 high-resolution real RAW images with 135,601 annotated instances spanning 62 categories, capturing a broad range of indoor and outdoor scenes under 9 distinct light and weather conditions, and provides a comprehensive benchmark for evaluating current detection methods.

Abstract

Existing object detection methods often take sRGB input, which is compressed from RAW data by an image signal processor (ISP) originally designed for visualization. However, this compression may discard information that is crucial for detection, especially under complex light and weather conditions. We introduce the AODRaw dataset, which offers 7,785 high-resolution real RAW images with 135,601 annotated instances spanning 62 categories, capturing a broad range of indoor and outdoor scenes under 9 distinct light and weather conditions. Because AODRaw supports both RAW and sRGB object detection, we use it to provide a comprehensive benchmark for evaluating current detection methods. We find that sRGB pre-training constrains the potential of RAW object detection due to the domain gap between sRGB and RAW, prompting us to pre-train directly on the RAW domain. However, camera noise makes it harder for RAW pre-training to learn rich representations than for sRGB pre-training. To assist RAW pre-training, we distill knowledge from an off-the-shelf model pre-trained on the sRGB domain. As a result, we achieve substantial improvements under diverse and adverse conditions without relying on extra pre-processing modules. Code and dataset are available at https://github.com/lzyhha/AODRaw.
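To make the distillation idea in the abstract concrete, the following is a minimal sketch of a generic logit-based distillation loss, in which a student (standing in for the model pre-trained on RAW) is pulled towards the softened predictions of a teacher (standing in for the off-the-shelf sRGB model). The abstract does not specify the distillation objective, so the soft-target KL formulation, the temperature value, and all names and logits below are illustrative assumptions rather than the authors' method.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T^2 factor is the usual scaling in soft-target distillation;
    it is an assumption here, not something stated in the source.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2


# Hypothetical logits: the teacher sees an sRGB image, the student its RAW counterpart.
teacher = [4.0, 1.0, 0.5]
student_close = [3.8, 1.1, 0.4]  # agrees with the teacher
student_far = [0.5, 1.0, 4.0]    # disagrees with the teacher

loss_close = distillation_loss(student_close, teacher)
loss_far = distillation_loss(student_far, teacher)
print(loss_close < loss_far)
```

In a real pre-training loop, a term like this would typically be added to the supervised loss so that the student benefits from the teacher's representations while still learning from RAW inputs.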

Paper Structure

This paper contains 21 sections, 2 equations, 11 figures, 10 tables.

Figures (11)

  • Figure 1: (a) Traditional sRGB-based object detection relies on 8-bit sRGB images, which are compressed from RAW images and lose detailed information. (b) Previous RAW-based methods utilize a trainable image signal processor (ISP) to adapt models pre-trained on the sRGB domain to the RAW domain. (c) We pre-train models on the RAW domain, achieving excellent performance on RAW object detection without requiring ISP modules.
  • Figure 2: Example images from AODRaw. From top to bottom, the rows show daylight, low-light, rain, and fog conditions, respectively. Some images are taken under multiple conditions. For example, the first image in the third row is taken in both low-light and rain conditions. More examples for each condition can be found in the supplementary material.
  • Figure 3: Statistics indicate that our AODRaw dataset contains increased category and instance diversity.
  • Figure 4: The distribution of object centers.
  • Figure 5: Top-1 accuracy on ImageNet-RAW when synthesizing RAW images under different average brightness. The maximum average brightness for an image is $2^{16}$.
  • ...and 6 more figures