Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors
Atif Belal, Akhil Meethal, Francisco Perdigon Romero, Marco Pedersoli, Eric Granger
TL;DR
Domain shifts degrade object detectors, and multi-source domain adaptation (MSDA) offers a remedy by leveraging multiple labeled source datasets together with unlabeled target data. The paper introduces ACIA, an attention-based class-conditioned alignment method that integrates class information into ROI-pooled instance features via a transformer-style attention block within a Mean-Teacher framework, coupled with an image-level multi-class discriminator and an instance-level discriminator, both trained through gradient reversal. Results on cross-time, cross-camera, and mixed-domain MSDA benchmarks show state-of-the-art performance and robustness to class imbalance, outperforming prototype-based class-conditioned methods while avoiding their pseudo-label error accumulation. The approach is also parameter-efficient, since it introduces no domain-specific parameters.
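To make the gradient-reversal step concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a single instance feature vector and a binary (source vs. target) logistic discriminator for brevity, whereas the summary above describes a multi-class image-level discriminator. The names `grl_forward`, `grl_backward`, `domain_loss_and_grads`, and the scaling factor `lam` are hypothetical.

```python
import math

def grl_forward(features):
    # Forward pass: the gradient reversal layer acts as the identity.
    return list(features)

def grl_backward(grad_output, lam=1.0):
    # Backward pass: flip the sign of the incoming gradient and scale it by lam.
    return [-lam * g for g in grad_output]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def domain_loss_and_grads(features, weights, domain_label, lam=1.0):
    """Binary domain classifier on one feature vector (assumed setup).

    Returns the loss, the gradient for the discriminator weights, and the
    reversed gradient that would flow back into the feature extractor.
    """
    x = grl_forward(features)
    logit = sum(w * xi for w, xi in zip(weights, x))
    p = sigmoid(logit)
    eps = 1e-12  # guards against log(0)
    loss = -(domain_label * math.log(p + eps)
             + (1 - domain_label) * math.log(1 - p + eps))
    dlogit = p - domain_label                  # d(loss)/d(logit) for sigmoid + BCE
    grad_weights = [dlogit * xi for xi in x]   # the discriminator descends on this
    # The feature extractor receives the reversed gradient, so it is pushed
    # to make the two domains harder to tell apart.
    grad_features = grl_backward([dlogit * w for w in weights], lam)
    return loss, grad_weights, grad_features

loss, grad_w, grad_f = domain_loss_and_grads([1.0, 2.0], [0.5, -0.25], 1)
```

The discriminator minimizes the domain loss while the feature extractor, seeing the negated gradient, effectively maximizes it; this opposition is what drives the features toward domain invariance.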
Abstract
Domain adaptation methods for object detection (OD) strive to mitigate the impact of distribution shifts by promoting feature alignment across source and target domains. Multi-source domain adaptation (MSDA) leverages multiple annotated source datasets together with unlabeled target data to improve the accuracy and robustness of the detection model. Most state-of-the-art MSDA methods for OD perform feature alignment in a class-agnostic manner. Class-agnostic alignment is challenging because objects of each class carry unique modality information, owing to variations in object appearance across domains. A recent prototype-based approach proposed class-wise alignment, yet it suffers from error accumulation caused by noisy pseudo-labels, which can negatively affect adaptation with imbalanced data. To overcome these limitations, we propose an attention-based class-conditioned alignment method for MSDA, designed to align instances of each object category across domains. In particular, an attention module combined with an adversarial domain classifier allows learning domain-invariant and class-specific instance representations. Experimental results on multiple MSDA benchmark datasets indicate that our method outperforms state-of-the-art methods and exhibits robustness to class imbalance, achieved through a conceptually simple class-conditioning strategy. Our code is available at: https://github.com/imatif17/ACIA.
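As an illustration of how class information might be injected into instance features through attention, the sketch below adds a class embedding to each ROI feature and then applies scaled dot-product self-attention across the instances of an image. This is an assumption made for illustration only: the abstract does not specify the query/key/value arrangement, learned projections are omitted, and the names `class_conditioned_attention`, `roi_features`, `class_ids`, and `class_embeddings` are hypothetical. The released code at the URL above is the reference for the actual design.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_conditioned_attention(roi_features, class_ids, class_embeddings):
    """Self-attention over instance tokens that carry class information.

    roi_features:     list of per-instance feature vectors (lists of floats)
    class_ids:        one class id per instance (ground truth or pseudo-label)
    class_embeddings: dict mapping class id -> embedding of the same dimension
    """
    # Token = ROI feature + embedding of its class (assumed conditioning scheme).
    tokens = [[f + e for f, e in zip(feat, class_embeddings[c])]
              for feat, c in zip(roi_features, class_ids)]
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        # Scaled dot-product scores between this token and every token.
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        weights = softmax(scores)
        # Attention output: weighted sum of the tokens (used as values).
        outputs.append([sum(w * v[i] for w, v in zip(weights, tokens))
                        for i in range(dim)])
    return outputs

embeddings = {0: [0.0, 0.0], 1: [0.5, 0.5]}
conditioned = class_conditioned_attention([[1.0, 0.0], [0.0, 1.0]], [1, 0], embeddings)
```

In a full pipeline under these assumptions, the conditioned instance features would then be passed to the instance-level domain discriminator, so that alignment acts on representations that already encode the object category.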
