Robustness of Object Detection of Autonomous Vehicles in Adverse Weather Conditions
Fox Pettersen, Hong Zhu
TL;DR
This work addresses the challenge of safely operating autonomous-vehicle object detectors under adverse weather by introducing a formal notion of operational robustness and a data-augmentation-driven metric, the average first failure coefficient ($AFFC$). It reports two experiments: (i) a cross-model robustness evaluation under seven weather and lighting conditions, using 57,400 weather-augmented images to quantify failure thresholds, in which Faster R-CNN achieves the strongest overall robustness at $AFFC\approx71.9\%$ and the YOLO variants reach around $43\%$; and (ii) an evaluation of synthetic weather training on OBRA models, in which the mid-level synthetic training ($M_{2.5}$) yields the best robustness at $AFFC\approx89.6\%$, while further training shows diminishing returns and potential forgetting. The proposed AFFC-based framework yields precise safe-operation thresholds (e.g., fog intensity limits) and supports comparative analysis across architectures, informing model development and deployment decisions in adverse conditions. Overall, the work demonstrates that the robustness evaluation is feasible and efficient, and it highlights the nuanced effects of weather-focused training on sustained reliability in real-world autonomous driving scenarios.
Abstract
As self-driving technology advances toward widespread adoption, determining safe operational thresholds across varying environmental conditions becomes critical for public safety. This paper proposes a method for evaluating the robustness of object detection ML models in autonomous vehicles under adverse weather conditions. The method employs data augmentation operators to generate synthetic data that simulates adverse operating conditions at progressively increasing intensity levels, in order to find the lowest intensity at which the object detection model fails. The robustness of the model is measured by the average first failure coefficient (AFFC) over the input images in the benchmark. The paper reports an experiment with four object detection models, YOLOv5s, YOLOv11s, Faster R-CNN, and Detectron2, using seven data augmentation operators that simulate the weather conditions of fog, rain, and snow, and the lighting conditions of dark, bright, flaring, and shadow. The experimental data show that the method is feasible, effective, and efficient for evaluating and comparing the robustness of object detection models under various adverse operating conditions. In particular, the Faster R-CNN model achieved the highest robustness, with an overall average AFFC of 71.9% across all seven adverse conditions, while the YOLO variants showed AFFC values of 43%. The method is also applied to assess how training on synthetic data that targets adverse operating conditions affects model robustness. Such training can improve robustness in adverse conditions but may suffer from diminishing returns and forgetting (i.e., a decline in robustness) if the model is overtrained.
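The search for the first failure described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact definition of the first failure coefficient, so the sketch assumes intensity levels normalised to (0, 1], takes the coefficient of an image to be the lowest level at which the detector fails, and assigns 1.0 to an image on which the detector never fails. The names `first_failure_coefficient`, `affc`, `toy_augment`, and `toy_passes` are hypothetical, and the toy "image" is a single brightness value standing in for a real image and detector.

```python
from typing import Any, Callable, Sequence


def first_failure_coefficient(
    image: Any,
    passes: Callable[[Any], bool],
    augment: Callable[[Any, float], Any],
    levels: Sequence[float],
) -> float:
    """Lowest intensity level at which the detector fails on `image`.

    Assumption: levels are normalised to (0, 1], and an image on which
    the detector never fails is assigned the maximum coefficient 1.0.
    """
    for level in sorted(levels):
        # Apply the adverse condition at this intensity, then check detection.
        if not passes(augment(image, level)):
            return level
    return 1.0


def affc(
    images: Sequence[Any],
    passes: Callable[[Any], bool],
    augment: Callable[[Any, float], Any],
    levels: Sequence[float],
) -> float:
    """Average first failure coefficient over a benchmark of images."""
    if not images:
        raise ValueError("benchmark must contain at least one image")
    coefficients = [
        first_failure_coefficient(img, passes, augment, levels) for img in images
    ]
    return sum(coefficients) / len(coefficients)


# Toy stand-ins (hypothetical): an "image" is one brightness value,
# darkening scales it down, and detection succeeds while it stays >= 0.5.
def toy_augment(brightness: float, level: float) -> float:
    return brightness * (1.0 - level)


def toy_passes(brightness: float) -> bool:
    return brightness >= 0.5


LEVELS = [i / 10 for i in range(1, 11)]  # 0.1, 0.2, ..., 1.0

if __name__ == "__main__":
    print(affc([1.0, 0.6], toy_passes, toy_augment, LEVELS))
```

In a real evaluation, `augment` would be one of the weather or lighting operators and `passes` would compare the detector's output on the augmented image against the ground truth. Under the assumptions of this sketch, a higher value means the model tolerates stronger adverse conditions before its first failure.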
