Making the Flow Glow -- Robot Perception under Severe Lighting Conditions using Normalizing Flow Gradients

Simon Kristoffersson Lind, Rudolph Triebel, Volker Krüger

TL;DR

This paper tackles robust robotic perception under severe lighting by shifting from global (whole-image) to pixel-level out-of-distribution (OOD) detection. It uses the absolute gradients of a normalizing flow (NF) likelihood as region-specific OOD cues, so camera parameters can be optimized for a region of interest (ROI) rather than for the whole image, which improves object detection in challenging scenes. In a tabletop picking experiment, the NF-gradient approach achieves a 60% higher success rate than previous methods, and gradient magnitude correlates with detection reliability across detectors such as YOLOv4 and Faster-RCNN. The work also provides code and a dataset, and discusses runtime considerations and future extensions.

Abstract

Modern robotic perception is highly dependent on neural networks. It is well known that neural network-based perception can be unreliable in real-world deployment, especially in difficult imaging conditions. Out-of-distribution detection is commonly proposed as a solution for ensuring reliability in real-world deployment. Previous work has shown that normalizing flow models can be used for out-of-distribution detection to improve the reliability of robotic perception tasks. Specifically, camera parameters can be optimized with respect to the likelihood output from a normalizing flow, which allows a perception system to adapt to difficult vision scenarios. In this work, we propose to use the absolute gradient values from a normalizing flow, which allows the perception system to optimize local regions rather than the whole image. By setting up a tabletop picking experiment with exceptionally difficult lighting conditions, we show that our method achieves a 60% higher success rate for an object detection task compared to previous methods.
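
The idea of a per-pixel absolute gradient image, scored over a local region, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the "flow" is a toy elementwise affine map with a standard normal latent (so its gradient has a closed form), and the names log_likelihood, abs_gradient_image, and roi_score are hypothetical. In the paper the NF is applied alongside a pretrained CNN, so the gradient would typically come from automatic differentiation through the network instead.

```python
import math

def log_likelihood(image, scale, shift):
    """Log-likelihood of an image under a toy elementwise affine flow.

    Each pixel x is mapped to z = scale * x + shift with a standard normal
    latent, so log p(x) = sum_i [log N(z_i; 0, 1) + log|scale|].
    """
    total = 0.0
    for row in image:
        for x in row:
            z = scale * x + shift
            total += -0.5 * z * z - 0.5 * math.log(2 * math.pi) + math.log(abs(scale))
    return total

def abs_gradient_image(image, scale, shift):
    """Per-pixel |d log p(x) / d x|, analytic for the toy flow: |-z * scale|."""
    return [[abs(-(scale * x + shift) * scale) for x in row] for row in image]

def roi_score(grad, top, left, height, width):
    """Mean absolute gradient inside a rectangular region of interest."""
    vals = [grad[r][c]
            for r in range(top, top + height)
            for c in range(left, left + width)]
    return sum(vals) / len(vals)

# Hypothetical 4x4 grayscale image in [0, 1]: mid-gray everywhere except an
# overexposed 2x2 patch in the bottom-right corner.
image = [[0.5, 0.5, 0.5, 0.5],
         [0.5, 0.5, 0.5, 0.5],
         [0.5, 0.5, 1.0, 1.0],
         [0.5, 0.5, 1.0, 1.0]]
scale, shift = 4.0, -2.0  # assumed toy parameters: the mode of p(x) is at x = 0.5

grad = abs_gradient_image(image, scale, shift)
patch_score = roi_score(grad, 2, 2, 2, 2)       # overexposed region
background_score = roi_score(grad, 0, 0, 2, 2)  # mid-gray region
```

In this toy model the overexposed patch receives a larger score than the mid-gray background, which shows how a gradient image localizes a signal that a single whole-image likelihood would average away. How such a score is turned into a camera-parameter objective is specific to the paper and is not reproduced here.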

Paper Structure

This paper contains 10 sections, 8 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: In the top image, taken with default camera parameters and auto-exposure, only the cup can be detected by YOLOv4. In the bottom image, where camera parameters have been optimized using our proposed method, the cup, spoon, fork, scissors, and book are successfully detected.
  • Figure 2: Left: Images from the COCO validation set. Right: Absolute gradient images from YOLOv4 + Normalizing Flow.
  • Figure 3: Illustration of how we apply NFs alongside a pretrained CNN for vision tasks.
  • Figure 4: Cluttered scene used in parameter optimization experiment.
  • Figure 5: Example images captured during experiments. Top left: Default parameters. Top right: Default parameters with auto-exposure and auto white balance. Bottom left: Parameters optimized according to scia23. Bottom right: Parameters optimized using our NF gradient image.