SS-SFR: Synthetic Scenes Spatial Frequency Response on Virtual KITTI and Degraded Automotive Simulations for Object Detection
Daniel Jakab, Alexander Braun, Cathaoir Agnew, Reenu Mohandas, Brian Michael Deegan, Dara Molloy, Enda Ward, Tony Scanlan, Ciarán Eising
TL;DR
The paper addresses the lack of image-quality evaluation in automotive simulation and the impact of optical degradations on perception for autonomous driving. It introduces Synthetic Scenes Spatial Frequency Response (SS-SFR): Gaussian blur is applied to Virtual KITTI, and $MTF50$ is measured with the Slanted Edge Method (ISO 12233), using NS-SFR to isolate edge regions. Three detectors (Faster RCNN, YOLOF, DETR) are trained and evaluated on four dataset variations. Average sharpness degrades from an $MTF50$ of $0.245$ cy/px to approximately $0.119$ cy/px, while detection performance stays largely robust, within 0.58% (Faster RCNN), 1.45% (YOLOF), and 1.93% (DETR). This indicates that synthetic data with optical degradations can still support reliable object detection, and it points to future work on more realistic degradations and other perception tasks to help close the sim-to-real gap.
Abstract
Automotive simulation can potentially compensate for a lack of training data in computer vision applications. However, there has been little to no evaluation of image quality in automotive simulation, and the impact of optical degradations on simulated data remains largely unexplored. In this work, we investigate Virtual KITTI and the impact of applying variations of Gaussian blur on image sharpness. Furthermore, we consider object detection, a common computer vision application, using three different state-of-the-art models, which allows us to characterize the relationship between object detection and sharpness. We find that while image sharpness (MTF50) degrades from an average of 0.245 cy/px to approximately 0.119 cy/px, object detection performance stays largely robust, within 0.58% (Faster RCNN), 1.45% (YOLOF), and 1.93% (DETR) across all respective held-out test sets.
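
To make the link between Gaussian blur and MTF50 concrete, the following is a minimal one-dimensional sketch, not the paper's pipeline. It blurs an ideal step edge, differentiates the edge spread function to get the line spread function, and reads MTF50 off the normalized Fourier magnitude. It assumes a perfect edge and omits the edge-angle oversampling of ISO 12233 and the edge-region selection of NS-SFR; the function names and the example values of sigma are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at 6 sigma (illustrative choice)."""
    radius = int(np.ceil(6 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def mtf50_from_lsf(lsf):
    """Frequency (cycles/pixel) at which the normalized MTF first drops below 0.5."""
    mtf = np.abs(np.fft.rfft(lsf))
    mtf = mtf / mtf[0]
    freqs = np.fft.rfftfreq(len(lsf), d=1.0)  # sample spacing of 1 pixel
    below = np.nonzero(mtf < 0.5)[0]
    if below.size == 0:
        return float("nan")  # MTF never falls below 0.5 up to Nyquist
    i = below[0]
    # Linear interpolation between the two samples that bracket MTF = 0.5.
    f0, f1 = freqs[i - 1], freqs[i]
    m0, m1 = mtf[i - 1], mtf[i]
    return float(f0 + (0.5 - m0) * (f1 - f0) / (m1 - m0))


def blurred_edge_mtf50(sigma, n=1024):
    """MTF50 of an ideal step edge after Gaussian blur with the given sigma (pixels)."""
    edge = np.zeros(n)
    edge[n // 2:] = 1.0
    kernel = gaussian_kernel(sigma)
    # Pad with edge values so the blur does not create artificial borders.
    padded = np.pad(edge, len(kernel) // 2, mode="edge")
    esf = np.convolve(padded, kernel, mode="valid")  # edge spread function
    lsf = np.diff(esf)                               # line spread function
    return mtf50_from_lsf(lsf)


def gaussian_mtf50(sigma):
    """Closed form for a continuous Gaussian: solve exp(-2 pi^2 sigma^2 f^2) = 0.5."""
    return float(np.sqrt(np.log(2) / 2) / (np.pi * sigma))


if __name__ == "__main__":
    for sigma in (1.0, 1.5, 2.0):  # hypothetical blur levels
        print(sigma, blurred_edge_mtf50(sigma), gaussian_mtf50(sigma))
```

For a pure Gaussian blur the numerical estimate should track the closed form, so MTF50 scales inversely with sigma. Rendered images already have finite sharpness before any blur is added, so measured values on a dataset would not follow this idealized relation exactly.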
