AdaptiveISP: Learning an Adaptive Image Signal Processor for Object Detection
Yujin Wang, Tianyi Xu, Fan Zhang, Tianfan Xue, Jinwei Gu
TL;DR
This work addresses optimizing the Image Signal Processor (ISP) for downstream object detection rather than merely maximizing image quality. It introduces AdaptiveISP, which models ISP configuration as a Markov Decision Process and learns a greedy, stage-by-stage policy over differentiable ISP modules with a detector-guided reward and penalties for reuse and latency. Through an actor-critic RL setup and integration of a pre-trained detector (YOLO-v3), AdaptiveISP achieves state-of-the-art detection performance across multiple datasets while enabling real-time adaptation to scene dynamics and adjustable efficiency. The approach yields practical insights into which ISP modules most influence detection and how to balance accuracy and computational cost in dynamic environments.
Abstract
Image Signal Processors (ISPs) convert raw sensor signals into digital images, which significantly influence the image quality and the performance of downstream computer vision tasks. Designing ISP pipeline and tuning ISP parameters are two key steps for building an imaging and vision system. To find optimal ISP configurations, recent works use deep neural networks as a proxy to search for ISP parameters or ISP pipelines. However, these methods are primarily designed to maximize the image quality, which are sub-optimal in the performance of high-level computer vision tasks such as detection, recognition, and tracking. Moreover, after training, the learned ISP pipelines are mostly fixed at the inference time, whose performance degrades in dynamic scenes. To jointly optimize ISP structures and parameters, we propose AdaptiveISP, a task-driven and scene-adaptive ISP. One key observation is that for the majority of input images, only a few processing modules are needed to improve the performance of downstream recognition tasks, and only a few inputs require more processing. Based on this, AdaptiveISP utilizes deep reinforcement learning to automatically generate an optimal ISP pipeline and the associated ISP parameters to maximize the detection performance. Experimental results show that AdaptiveISP not only surpasses the prior state-of-the-art methods for object detection but also dynamically manages the trade-off between detection performance and computational cost, especially suitable for scenes with large dynamic range variations. Project website: https://openimaginglab.github.io/AdaptiveISP/.
