Leveraging Confident Image Regions for Source-Free Domain-Adaptive Object Detection
Mohamed Lamine Mekhalfi, Davide Boscaini, Fabio Poiesi
TL;DR
This work tackles source-free domain-adaptive object detection by introducing SF-DACA, a data-augmentation-driven framework that selects confident regions from target images, composes them into challenging composite samples, and self-trains the detector while preserving source knowledge through a teacher-student setup. The method follows a four-step pipeline (detect, augment, compose, adapt) that relies on region-level pseudo-labels and a consistency objective, with an exponential moving-average teacher guiding the adaptation to prevent collapse. Experiments on Cityscapes, FoggyCityscapes, Sim10K, and KITTI show state-of-the-art performance on two of three traffic-scene benchmarks, and ablations confirm the importance of the region selection thresholds and of the DACA augmentation. The approach offers a practical solution for source-free unsupervised domain adaptation (SF-UDA) in object detection, with a possible extension to zero-shot grounding to further reduce false positives and domain misalignment.
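The exponential moving-average teacher mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `ema_update`, the momentum value of 0.999, and the representation of model weights as a dictionary of NumPy arrays are all assumptions made for illustration.

```python
import numpy as np

def ema_update(teacher, student, momentum=0.999):
    """Move each teacher weight a small step toward the student weight.

    teacher, student: dicts mapping parameter names to NumPy arrays
    (a hypothetical stand-in for real detector weights).
    momentum: assumed value; higher means a slower-moving teacher.
    """
    for name, w_student in student.items():
        teacher[name] = momentum * teacher[name] + (1.0 - momentum) * w_student
    return teacher

# Usage: the teacher drifts slowly toward the student after each step.
teacher = {"w": np.array([1.0, 1.0])}
student = {"w": np.array([0.0, 2.0])}
ema_update(teacher, student, momentum=0.9)
```

Because the teacher changes slowly, its pseudo-labels stay stable even if the student temporarily degrades, which is the usual motivation for this kind of update when source data is unavailable.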
Abstract
Source-free domain-adaptive object detection is a scarcely addressed problem. It aims to adapt a source-pretrained detector to a distinct target domain without access to source data during adaptation. So far, no data augmentation scheme has been tailored to this setting. To fill this gap, this paper presents a novel data augmentation approach that cuts out target image regions where the detector is confident, augments them along with their respective pseudo-labels, and joins them into a challenging target image used to adapt the detector. As the source data is out of reach during adaptation, we implement our approach within a teacher-student learning paradigm to ensure that the model does not collapse during the adaptation procedure. We evaluate our approach on three adaptation benchmarks of traffic scenes and set a new state of the art on two of them.
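The cut, augment, and join steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the method as implemented in the paper: it assumes regions are image quadrants, confidence is the mean score of detections above a threshold `thr`, the augmentation is a horizontal flip, and the composite is a 2x2 tiling. The helper names `select_region`, `hflip`, and `compose` are hypothetical.

```python
import numpy as np

def select_region(image, dets, thr=0.5):
    """Pick the quadrant with the highest mean detection confidence.

    dets: list of (x0, y0, x1, y1, score, label) in pixel coordinates.
    Returns (crop, boxes) with boxes as (x0, y0, x1, y1, label) in crop
    coordinates, or None if no detection passes the threshold.
    """
    h, w = image.shape[:2]
    qh, qw = h // 2, w // 2
    best = None
    for r in range(2):
        for c in range(2):
            oy, ox = r * qh, c * qw
            scores, boxes = [], []
            for x0, y0, x1, y1, s, label in dets:
                cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
                # Keep confident boxes whose center lies in this quadrant.
                if s >= thr and ox <= cx < ox + qw and oy <= cy < oy + qh:
                    scores.append(s)
                    # Clip to the quadrant and shift to crop coordinates.
                    boxes.append((max(x0, ox) - ox, max(y0, oy) - oy,
                                  min(x1, ox + qw) - ox,
                                  min(y1, oy + qh) - oy, label))
            if scores and (best is None or np.mean(scores) > best[0]):
                best = (np.mean(scores), image[oy:oy + qh, ox:ox + qw], boxes)
    return None if best is None else (best[1], best[2])

def hflip(crop, boxes):
    """Horizontally flip a crop together with its pseudo-label boxes."""
    w = crop.shape[1]
    return crop[:, ::-1], [(w - x1, y0, w - x0, y1, l)
                           for x0, y0, x1, y1, l in boxes]

def compose(regions):
    """Tile four (crop, boxes) pairs into a 2x2 composite image."""
    first = regions[0][0]
    qh, qw = first.shape[:2]
    canvas = np.zeros((2 * qh, 2 * qw) + first.shape[2:], dtype=first.dtype)
    labels = []
    for i, (crop, boxes) in enumerate(regions):
        oy, ox = (i // 2) * qh, (i % 2) * qw
        canvas[oy:oy + qh, ox:ox + qw] = crop
        # Shift each box by the offset of its tile in the composite.
        labels += [(x0 + ox, y0 + oy, x1 + ox, y1 + oy, l)
                   for x0, y0, x1, y1, l in boxes]
    return canvas, labels
```

In a full pipeline, the composite image and its pseudo-labels would be fed to the student detector, while the teacher supplies the detections passed to `select_region`.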
