Efficient Camera Exposure Control for Visual Odometry via Deep Reinforcement Learning

Shuyang Zhang, Jinhao He, Yilong Zhu, Jin Wu, Jie Yuan

TL;DR

This paper tackles the degradation of visual odometry (VO) under rapid illumination changes by introducing a deep reinforcement learning framework for camera exposure control that is trained entirely offline. It decouples exposure selection from parameter allocation via a two-module pipeline and uses a lightweight image simulator with bracketing-based photometric synthesis and motion augmentation for data-efficient training. Three reward designs (statistical, feature-based, and pose-based) give the agents different levels of VO-relevant intelligence, with the feature-based reward offering robust performance in challenging sequences and fast inference on CPU. The approach demonstrates improved VO stability and faster reaction times than traditional methods, which suggests practical potential for robust, exposure-aware VO in real-world robotics without online hardware interaction.

Abstract

The stability of visual odometry (VO) systems is undermined by degraded image quality, especially in environments with significant illumination changes. This study employs a deep reinforcement learning (DRL) framework to train agents for exposure control, aiming to enhance imaging performance in challenging conditions. A lightweight image simulator is developed to facilitate the training process, enabling the diversification of image exposure and sequence trajectory. This setup enables completely offline training, eliminating the need for direct interaction with camera hardware and real environments. Reward functions at different levels are crafted to enhance the VO systems, equipping the DRL agents with varying degrees of intelligence. Extensive experiments show that our exposure control agents achieve superior efficiency, with an average inference duration of 1.58 ms per frame on a CPU, and respond more quickly than traditional feedback control schemes. With an appropriate reward function, agents acquire an understanding of motion trends and anticipate future illumination changes. This predictive capability allows VO systems to deliver more stable and precise odometry results. The code and datasets are available at https://github.com/ShuyangUni/drl_exposure_ctrl.

Paper Structure (26 sections, 12 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: An illustration of drastic illumination change in our Corridor sequence. Our DRL-based method with feature-level rewards (DRL-feat) exhibits a high-level comprehension of lighting and motion, surpassing the traditional method (Built-in) and DRL method with image statistic-level rewards (DRL-stat). The agent DRL-feat predicts the impending over-exposure event and preemptively reduces the exposure. While this adjustment temporarily decreases the number of tracked feature points, it effectively prevents a more severe failure in subsequent frames.
  • Figure 2: System overview of the training and inference phases. In the training phase, we employ the image bracketing technique for simulation, enhanced by data augmentation to diversify sequence motion. The Soft Actor-Critic (SAC) framework is adopted for our DRL implementation. Within this framework, agents (actors) of varying intelligence levels are trained using distinct reward designs. In the inference phase, these trained agents generate a continuous relative action signal, which is then translated into the target exposure for the next image. The resulting control signals, comprising both exposure time and analog gain, are allocated via a rule-based strategy before transmission to the camera hardware. The newly captured image is subsequently fed back into the agent's input for ongoing inference.
  • Figure 3: Our image simulation environment for DRL training. The photometric synthesis module enables offline interaction between the agents and captured image sequences. The motion data augmentation significantly enhances the diversity of available trajectories.
  • Figure 4: Our experimental platform and the on-campus scenarios used for training data.
  • Figure 5: Reaction-speed experiment in which the light is switched off and then on. The method of Zhang et al. [zhang2024image] reacts fastest because it relies on one-step control. All our DRL methods (DRL-stat, DRL-feat, and DRL-pose) respond faster than the feedback control methods (Built-in and Shim).
  • ...and 1 more figures
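
The inference step described in the Figure 2 caption (a continuous relative action is turned into a target exposure, which is then split into exposure time and analog gain by a rule) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the names `CameraLimits`, `apply_relative_action`, and `allocate` are hypothetical, as are the action range of [-1, 1], the interpretation of the action as a change in stops, the hardware limits, and the "exposure time first, gain only when time saturates" rule. The caption states only that a rule-based strategy is used.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraLimits:
    # Hypothetical hardware limits; real values depend on the sensor.
    min_time_us: float = 20.0
    max_time_us: float = 20000.0
    min_gain_db: float = 0.0
    max_gain_db: float = 24.0


def apply_relative_action(exposure, action, max_step_stops=1.0):
    """Scale the current exposure by a relative action.

    Assumption: the action lies in [-1, 1] and is read as a change in
    stops (log2 scale), so +1 doubles and -1 halves the exposure when
    max_step_stops is 1.
    """
    action = max(-1.0, min(1.0, action))  # clip out-of-range actions
    return exposure * 2.0 ** (action * max_step_stops)


def allocate(target_exposure, limits):
    """Split a target exposure into (exposure time in us, analog gain in dB).

    Assumption: exposure is expressed as equivalent microseconds at 0 dB
    gain, i.e. exposure = time_us * 10 ** (gain_db / 20).
    Assumed rule: use exposure time first, and add gain only once the
    exposure time has reached its upper limit.
    """
    time_us = min(max(target_exposure, limits.min_time_us), limits.max_time_us)
    residual = target_exposure / time_us  # > 1 only when time is saturated
    gain_db = 20.0 * math.log10(residual) if residual > 1.0 else 0.0
    gain_db = min(max(gain_db, limits.min_gain_db), limits.max_gain_db)
    return time_us, gain_db


if __name__ == "__main__":
    limits = CameraLimits()
    exposure = 15000.0  # current exposure, equivalent microseconds
    target = apply_relative_action(exposure, action=1.0)  # agent asks for +1 stop
    print(allocate(target, limits))
```

With these assumed limits, a target of 30000 equivalent microseconds exceeds the maximum exposure time, so the sketch pins the time at 20000 us and covers the remaining factor of 1.5 with analog gain. A real allocation rule might instead cap exposure time earlier to limit motion blur; that trade-off is outside what the caption specifies.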