Multi-Source Domain Adaptation for Object Detection with Prototype-based Mean-teacher
Atif Belal, Akhil Meethal, Francisco Perdigon Romero, Marco Pedersoli, Eric Granger
TL;DR
This work tackles MSDA for object detection by introducing Prototype-based Mean Teacher (PMT), a memory-efficient method that encodes domain-specific information with class prototypes rather than per-source subnets. It combines a mean-teacher framework for semi-supervised adaptation with a multi-domain discriminator for domain-invariant features and a prototype network using a contrastive loss to align same-class prototypes across domains while separating different classes. The approach yields strong improvements over state-of-the-art MSDA methods across cross-time, cross-camera, and mixed-domain benchmarks, while showing negligible parameter growth as the number of source domains increases. These results suggest PMT provides scalable, robust MSDA for OD with practical impact on real-world deployment where multiple, diverse source domains are available.
Abstract
Adapting visual object detectors to operational target domains is a challenging task, commonly achieved using unsupervised domain adaptation (UDA) methods. Recent studies have shown that when the labeled dataset comes from multiple source domains, treating them as separate domains and performing a multi-source domain adaptation (MSDA) improves the accuracy and robustness over blending these source domains and performing a UDA. For adaptation, existing MSDA methods learn domain-invariant and domain-specific parameters (for each source domain). However, unlike single-source UDA methods, learning domain-specific parameters makes them grow significantly in proportion to the number of source domains. This paper proposes a novel MSDA method called Prototype-based Mean Teacher (PMT), which uses class prototypes instead of domain-specific subnets to encode domain-specific information. These prototypes are learned using a contrastive loss, aligning the same categories across domains and separating different categories far apart. Given the use of prototypes, the number of parameters required for our PMT method does not increase significantly with the number of source domains, thus reducing memory issues and possible overfitting. Empirical studies indicate that PMT outperforms state-of-the-art MSDA methods on several challenging object detection datasets. Our code is available at https://github.com/imatif17/Prototype-Mean-Teacher.
