Evaluation of autonomous systems under data distribution shifts

Daniel Sikar, Artur Garcez

TL;DR

This work tackles safety under data distribution shifts for autonomous perception by introducing distance-based safety thresholds between training and testing data. It combines a Unity-based driving dataset (SDSandbox) with pixel-intensity RGB shifts and RGB→YUV preprocessing to quantify how shifts affect predictive accuracy, and evaluates multiple error and distribution-distance metrics. The study finds that, in RGB space, simple histogram-based distances can robustly indicate safe operation with a practical threshold (e.g., Histogram Intersection around 0.40) and a clear safe-shift window of roughly ±40 in pixel intensity; YUV-based distances scale differently (exponentially), which complicates threshold selection. The proposed P_safe rule and the preference for fast RGB histogram metrics offer a practical, real-time mechanism to halt operation or hand control to a human when the distribution shift exceeds a defined safety boundary, with implications for deploying autonomous systems in changing environments.
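As a rough illustration of the histogram-based check summarised above, the following Python sketch shifts pixel intensities, builds normalised RGB histograms, and compares them with Histogram Intersection. The function names (`shift_pixels`, `rgb_histogram`, `is_safe`) are hypothetical, and two things are assumptions rather than details taken from the paper: that a higher intersection means "more similar", and that operation counts as safe when the intersection is at or above the threshold. The exact form of the P_safe rule is not given in this summary.

```python
import numpy as np

def shift_pixels(image, shift):
    """Add a constant to every channel value, clipping to the valid range [0, 255]."""
    shifted = image.astype(np.int16) + shift  # widen first to avoid uint8 wrap-around
    return np.clip(shifted, 0, 255).astype(np.uint8)

def rgb_histogram(image, bins=256):
    """Per-channel intensity histograms, concatenated over R, G, B and normalised to sum to 1."""
    hists = [
        np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]
    hist = np.concatenate(hists).astype(np.float64)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima: 1.0 for identical normalised histograms, 0.0 for disjoint ones."""
    return float(np.minimum(h1, h2).sum())

def is_safe(train_hist, test_hist, threshold=0.40):
    """ASSUMPTION: safe means intersection >= threshold; the paper's P_safe rule may differ."""
    return histogram_intersection(train_hist, test_hist) >= threshold
```

In use, `train_hist` would be computed once from training images, and each incoming frame would be tested with `is_safe(train_hist, rgb_histogram(frame))`. The default of 0.40 mirrors the figure quoted above but should be treated as something to calibrate empirically.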

Abstract

We posit that data can only be safe to use up to a certain threshold of the data distribution shift, after which control must be relinquished by the autonomous system and operation halted or handed to a human operator. With the use of a computer vision toy example we demonstrate that network predictive accuracy is impacted by data distribution shifts and propose distance metrics between training and testing data to define safe operation limits within said shifts. We conclude that beyond an empirically obtained threshold of the data distribution shift, it is unreasonable to expect network predictive accuracy not to degrade.
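One way to picture how such a threshold could be obtained empirically is to sweep a range of intensity shifts, record the prediction error at each, and keep the shifts whose error stays within a tolerance. The sketch below does this with mean absolute error (MAE). Everything in it is illustrative: `toy_predict` is a hypothetical stand-in for a trained steering network, and `sweep_shifts`, `safe_shifts` and the `max_mae` tolerance are assumed names and values, not the paper's procedure or results.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Mean absolute difference between reference values and predictions."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def toy_predict(batch):
    """HYPOTHETICAL stand-in for a steering network: maps mean brightness to [-0.5, 0.5]."""
    return batch.mean(axis=(1, 2, 3)) / 255.0 - 0.5

def sweep_shifts(images, targets, predict, shifts):
    """Return {shift: MAE} after adding each shift to every pixel of every image."""
    errors = {}
    for s in shifts:
        shifted = np.clip(images.astype(np.int16) + s, 0, 255).astype(np.uint8)
        errors[s] = mean_absolute_error(targets, predict(shifted))
    return errors

def safe_shifts(errors, max_mae):
    """Shifts, in ascending order, whose MAE does not exceed the chosen tolerance."""
    return [s for s, e in sorted(errors.items()) if e <= max_mae]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic batch of 8 RGB images; real use would load held-out driving frames.
    images = rng.integers(100, 150, size=(8, 16, 16, 3), dtype=np.uint8)
    targets = toy_predict(images)  # unshifted predictions serve as the reference here
    errors = sweep_shifts(images, targets, toy_predict, range(-120, 121, 40))
    print(safe_shifts(errors, max_mae=0.2))
```

The tolerance is a design choice: a tighter `max_mae` narrows the window of acceptable shifts, and the resulting boundary can then be paired with a distance metric between training and test data so that the check does not need ground truth at run time.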

Paper Structure (13 sections, 17 equations, 10 figures, 4 tables)

This paper contains 13 sections, 17 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Simulated accident in the CARLA Simulator Town 10, where excessive brightness, i.e. high RGB values, causes the predictive accuracy of self-driving models to degrade
  • Figure 2: LeNet predictive accuracy change under rotation (left) and translation shift (right)
  • Figure 3: Two sets of images and pixel intensity value histograms, where the set on the left is a negative shift (pixel intensity values decrease) and the set on the right is a positive shift (pixel intensity values increase)
  • Figure 4: Left to right: the SDSandbox self-driving neural network training application, the Generated Track circuit, and a steering angle histogram showing the distribution of steering angles when going around the track clockwise
  • Figure 5: Plots of ground truth (SDSandbox PID steering output) and nvidia2 network predictions for images with pixel value intensity shifts of 40, 80 and 120, where the steering error (st. err.) for RGB shift is the MAE
  • ...and 5 more figures