Predictive Inequity in Object Detection
Benjamin Wilson, Judy Hoffman, Jamie Morgenstern
TL;DR
The paper examines predictive inequity in pedestrian detection between two Fitzpatrick skin-tone groups, lighter skin (LS, types 1-3) and darker skin (DS, types 4-6), in driving datasets, and reports consistently lower detection performance on DS pedestrians across multiple models and training regimes. It establishes a benchmark by annotating BDD100K pedestrians with skin tones, defines a loss-based inequity metric, and analyzes potential sources such as occlusion, time of day, and loss prioritization. Detectors such as Faster R-CNN and Mask R-CNN generally achieve higher AP on LS pedestrians, especially at AP75, and simple loss reweighting partially mitigates the gap. The work underscores the importance of fairness in safety-critical vision systems and suggests that dataset and training adjustments can reduce, but not fully eliminate, predictive inequity, which motivates broader strategies for equitable autonomous-driving perception.
Abstract
In this work, we investigate whether state-of-the-art object detection systems have equitable predictive performance on pedestrians with different skin tones. This work is motivated by many recent examples of ML and vision systems displaying higher error rates for certain demographic groups than others. We annotate an existing large-scale dataset which contains pedestrians, BDD100K, with Fitzpatrick skin tones in ranges [1-3] or [4-6]. We then provide an in-depth comparative analysis of performance between these two skin tone groupings, finding that neither time of day nor occlusion explains the performance disparity between them, suggesting this disparity is not merely the result of pedestrians in the 4-6 range appearing in more difficult scenes for detection. We investigate to what extent time of day, occlusion, and reweighting the supervised loss during training affect this predictive bias.
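As a rough illustration of what "reweighting the supervised loss" can mean, the sketch below computes a weighted mean of per-pedestrian losses, with a larger weight on one skin-tone group, and a simple between-group loss gap. It is a minimal sketch under stated assumptions, not the paper's formulation: the function names `reweighted_loss` and `group_loss_gap`, the group labels "LS" and "DS", and the weight values are all hypothetical.

```python
from typing import Dict, Sequence


def reweighted_loss(
    losses: Sequence[float],
    groups: Sequence[str],
    weights: Dict[str, float],
) -> float:
    """Weighted mean of per-instance losses.

    Each instance's loss is scaled by the weight of its group label;
    labels missing from `weights` default to 1.0 (hypothetical choice).
    """
    total = 0.0
    norm = 0.0
    for loss, group in zip(losses, groups):
        w = weights.get(group, 1.0)
        total += w * loss
        norm += w
    return total / norm if norm else 0.0


def group_loss_gap(
    losses: Sequence[float],
    groups: Sequence[str],
    a: str = "LS",
    b: str = "DS",
) -> float:
    """Mean loss on group `b` minus mean loss on group `a`.

    A positive value means the model incurs higher loss on group `b`.
    Both groups are assumed to be non-empty.
    """
    loss_a = [l for l, g in zip(losses, groups) if g == a]
    loss_b = [l for l, g in zip(losses, groups) if g == b]
    return sum(loss_b) / len(loss_b) - sum(loss_a) / len(loss_a)


if __name__ == "__main__":
    # Toy per-pedestrian losses; the values are made up for illustration.
    losses = [1.0, 3.0]
    groups = ["LS", "DS"]
    print(reweighted_loss(losses, groups, {}))           # uniform weighting
    print(reweighted_loss(losses, groups, {"DS": 3.0}))  # upweight DS
    print(group_loss_gap(losses, groups))
```

Upweighting a group makes its errors count for more in the training objective, which is one plausible way to push an optimizer toward reducing that group's loss; how large the weight should be is an empirical question.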
