Source-Free Object Detection with Detection Transformer

Huizai Yao; Sicheng Zhao; Shuo Lu; Hui Chen; Yangyang Li; Guoping Liu; Tengfei Xing; Chenggang Yan; Jianhua Tao; Guiguang Ding

TL;DR

FRANCK is a source-free domain adaptation framework for object detection tailored to DETR. It integrates four components: Objectness Score-based Sample Reweighting (OSSR), Contrastive Learning with Matching-based Memory Bank (CMMB), Uncertainty-weighted Query-fused Feature Distillation (UQFD), and Dynamic Teacher Updating Interval (DTUI). Together they aim at robust cross-domain transfer without access to source data. The approach targets category-, instance-, and feature-level alignment through a unified, query-centric design that uses pseudo bipartite matching and memory banks to enable class-wise contrastive learning and reliable pseudo supervision. Experiments across Cityscapes/Foggy Cityscapes, Sim10k→Cityscapes, and cross-dataset/scene/weather settings demonstrate state-of-the-art performance for DETR-based SFOD, with improved discriminability, stability, and generalization. The work advances practical SFOD for DETR by exploiting DETR-specific structures and dynamic self-training. Potential extensions include multi-source domains and integration with vision foundation models.

Abstract

Source-Free Object Detection (SFOD) enables knowledge transfer from a source domain to an unsupervised target domain for object detection without access to source data. Most existing SFOD approaches are either confined to conventional object detection (OD) models like Faster R-CNN or designed as general solutions without tailored adaptations for novel OD architectures, especially Detection Transformer (DETR). In this paper, we introduce Feature Reweighting ANd Contrastive Learning NetworK (FRANCK), a novel SFOD framework specifically designed to perform query-centric feature enhancement for DETRs. FRANCK comprises four key components: (1) an Objectness Score-based Sample Reweighting (OSSR) module that computes attention-based objectness scores on multi-scale encoder feature maps, reweighting the detection loss to emphasize less-recognized regions; (2) a Contrastive Learning with Matching-based Memory Bank (CMMB) module that integrates multi-level features into memory banks, enhancing class-wise contrastive learning; (3) an Uncertainty-weighted Query-fused Feature Distillation (UQFD) module that improves feature distillation through prediction quality reweighting and query feature fusion; and (4) an improved self-training pipeline with a Dynamic Teacher Updating Interval (DTUI) that optimizes pseudo-label quality. By leveraging these components, FRANCK effectively adapts a source-pre-trained DETR model to a target domain with enhanced robustness and generalization. Extensive experiments on several widely used benchmarks demonstrate that our method achieves state-of-the-art performance, highlighting its effectiveness and compatibility with DETR-based SFOD models.
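The self-training pipeline in component (4) relies on a teacher model that is refreshed at some interval. The abstract does not say how DTUI chooses that interval or how the teacher is refreshed, so the sketch below is only an illustration under stated assumptions: it uses an exponential moving average (EMA) of student weights, as is common in mean-teacher self-training, and takes the interval as a plain argument supplied by the caller. The names `ema_update`, `maybe_update_teacher`, `interval`, and `momentum` are hypothetical and do not come from the paper.

```python
from typing import Dict, List

# Toy parameter container: name -> flat list of weights (stands in for model tensors).
Params = Dict[str, List[float]]


def ema_update(teacher: Params, student: Params, momentum: float) -> None:
    """In-place EMA: teacher <- momentum * teacher + (1 - momentum) * student.

    Assumption: EMA is a generic mean-teacher rule, not confirmed as FRANCK's.
    """
    for name, t_vals in teacher.items():
        s_vals = student[name]
        teacher[name] = [
            momentum * t + (1.0 - momentum) * s for t, s in zip(t_vals, s_vals)
        ]


def maybe_update_teacher(
    step: int,
    interval: int,
    teacher: Params,
    student: Params,
    momentum: float = 0.999,
) -> bool:
    """Refresh the teacher only on steps that are multiples of `interval`.

    `interval` is supplied by the caller; a dynamic scheme would recompute it
    during training (the rule for doing so is not given in the abstract).
    Returns True when an update was applied.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if step % interval != 0:
        return False
    ema_update(teacher, student, momentum)
    return True


if __name__ == "__main__":
    teacher = {"w": [0.0, 0.0]}
    student = {"w": [1.0, 2.0]}
    for step in range(1, 7):
        if maybe_update_teacher(step, interval=3, teacher=teacher, student=student, momentum=0.5):
            print(step, teacher["w"])
```

With a longer interval the teacher changes less often, so its pseudo-labels stay fixed for more student steps. A shorter interval lets the teacher track the student more closely. A dynamic scheme would vary `interval` between calls rather than keep it constant.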
