Enhancing Vehicle Detection under Adverse Weather Conditions with Contrastive Learning
Boying Li, Chang Liu, Petter Kyösti, Mattias Öhman, Devashish Singha Roy, Sofia Plazzi, Hamam Mokayed, Olle Hagner
TL;DR
This work tackles vehicle detection in UAV imagery under Nordic winter conditions, where snow-induced domain shifts degrade performance. It introduces Sideload-Contrastive-Learning-Adaptation (SCLA), a two-stage framework that first pretrains a side CNN on unannotated data via Feature Map Patch-level Contrastive Learning (FM-PaCL) and then fuses its features with those of a frozen COCO-pretrained YOLO11n backbone through SE gating ahead of the detection heads. Empirical results on the Nordic Vehicle Dataset (NVD) show substantial improvements in $\mathrm{mAP}_{50}$, with an $8.9\%$ gain under the NVD protocol and robust performance across alternative splits. Ablations indicate that combining COCO pretraining with PaCL on unannotated data yields the strongest gains, while blockwise fusion can hinder performance. The approach enables annotation-efficient vehicle detection suitable for edge devices, addressing the domain gaps caused by snow coverage and weather variability in Nordic UAV applications.
Abstract
Aside from common remote sensing challenges such as small, sparse targets and limited computation budgets, detecting vehicles in UAV images of the Nordic regions faces strong visibility challenges and domain shifts caused by varying levels of snow coverage. Although annotated data are expensive, unannotated data are cheaper to obtain, simply by flying the drones. In this work, we propose a sideload-CL-adaptation framework that enables the use of unannotated data to improve vehicle detection with lightweight models. Specifically, we propose to train a CNN-based representation extractor through contrastive learning on the unannotated data in the pretraining stage, and then sideload it to a frozen YOLO11n backbone in the fine-tuning stage. To find a robust sideload-CL-adaptation, we conducted extensive experiments comparing various fusion methods and granularities. Our proposed sideload-CL-adaptation model improves detection performance by 3.8% to 9.5% in terms of mAP50 on the NVD dataset.
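To make the pretraining idea concrete, the following is a minimal sketch of a contrastive objective over patch embeddings. It assumes an InfoNCE-style loss, which is a common choice for contrastive learning but is not confirmed by the abstract as the loss actually used. The function names `l2_normalize` and `info_nce`, the `temperature` parameter, and the representation of embeddings as plain Python lists are all illustrative assumptions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss (assumed, not the paper's confirmed objective).

    anchors[i] and positives[i] are embeddings of two views of the same
    patch; every positives[j] with j != i acts as a negative for anchors[i].
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching patch as the correct class.
        total += log_denom - logits[i]
    return total / len(a)


if __name__ == "__main__":
    views_a = [[1.0, 0.0], [0.0, 1.0]]
    views_b = [[0.9, 0.1], [0.1, 0.9]]
    print("matched pairs:   ", info_nce(views_a, views_b))
    print("mismatched pairs:", info_nce(views_a, views_b[::-1]))
```

In this sketch the loss is small when each anchor is most similar to its own positive and large when the pairing is shuffled. If all candidates are identical, it reduces to $\log N$ for $N$ patches. Under the framework described above, an extractor trained with such an objective would then be sideloaded to the frozen backbone during fine-tuning.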
