CGSA: Class-Guided Slot-Aware Adaptation for Source-Free Object Detection

Boyang Dai, Zeng Fan, Zihao Qi, Meng Lou, Yizhou Yu

TL;DR

CGSA is the first framework that brings Object-Centric Learning (OCL) into SF-DAOD by integrating slot-aware adaptation into a DETR-based detector, indicating the promise of object-centric design in privacy-sensitive adaptation scenarios.

Abstract

Source-Free Domain Adaptive Object Detection (SF-DAOD) aims to adapt a detector trained on a labeled source domain to an unlabeled target domain without retaining any source data. Despite recent progress, most popular approaches focus on tuning pseudo-label thresholds or refining the teacher-student framework, while overlooking object-level structural cues within cross-domain data. In this work, we present CGSA, the first framework that brings Object-Centric Learning (OCL) into SF-DAOD by integrating slot-aware adaptation into a DETR-based detector. Specifically, our approach integrates a Hierarchical Slot Awareness (HSA) module into the detector to progressively disentangle images into slot representations that act as visual priors. These slots are then guided toward class semantics via a Class-Guided Slot Contrast (CGSC) module, maintaining semantic consistency and promoting domain-invariant adaptation. Extensive experiments on multiple cross-domain datasets demonstrate that our approach outperforms previous SF-DAOD methods. Theoretical derivations and experimental analysis further demonstrate the effectiveness of the proposed components and the framework, indicating the promise of object-centric design in privacy-sensitive adaptation scenarios. Code is released at https://github.com/Michael-McQueen/CGSA.

Paper Structure (59 sections, 40 equations, 11 figures, 8 tables, 3 algorithms)

This paper contains 59 sections, 40 equations, 11 figures, 8 tables, 3 algorithms.

Figures (11)

  • Figure 1: Motivation of the proposed CGSA. (a) Most popular methods focus on filtering pseudo labels. (b) Our CGSA proposes a slot-aware framework to serve as a bridge for aligning common object-level structural features without accessing source data.
  • Figure 2: Framework of CGSA. It consists of two stages: source-domain pretraining and target-domain adaptation, which work collaboratively to improve SF-DAOD performance.
  • Figure 3: (a) Pipeline of the proposed HSA module. Through the hierarchical design, the features are first decomposed into coarse-to-fine slots. After projection, these slots are concatenated with the object queries to form slot-aware queries, thereby providing object-level structural priors. (b) Pipeline of the proposed CGSC module. By maintaining the global class prototype and assigning pseudo labels to weighted slots for class attributes, contrastive learning is used to implicitly supervise and guide the slots to focus on domain-invariant yet class-relevant object features.
  • Figure 4: Ablation on four cross-domain benchmarks. “Source Only” denotes the baseline without adaptation, where the source-trained model is directly tested on the target domain.
  • Figure 5: The t-SNE visualization comparing feature distributions of object queries on the Foggy-Cityscapes dataset between “Source Only” and our approach.
  • ...and 6 more figures
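
The two pipelines described in the Figure 3 caption can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the released repository for that). The function names, the linear projection, the EMA prototype update, and the InfoNCE-style loss are all assumptions chosen to match the caption's wording, and the hierarchical coarse-to-fine slot decomposition itself is omitted: slots are taken as given inputs.

```python
import numpy as np

def build_slot_aware_queries(object_queries, slots, proj_weight):
    """Project slots to the query dimension and append them to the object queries.

    Hypothetical helper: the caption only says slots are projected and
    concatenated with the object queries; a single linear map is assumed here.
    """
    projected = slots @ proj_weight  # (num_slots, d_query)
    return np.concatenate([object_queries, projected], axis=0)

def update_prototypes(prototypes, slots, pseudo_labels, momentum=0.9):
    """Update global class prototypes from pseudo-labeled slots.

    Assumption: an exponential moving average over per-class slot means.
    Classes with no slot in this batch are left unchanged.
    """
    updated = prototypes.copy()
    for c in np.unique(pseudo_labels):
        class_mean = slots[pseudo_labels == c].mean(axis=0)
        updated[c] = momentum * updated[c] + (1.0 - momentum) * class_mean
    return updated

def slot_prototype_contrast(slots, prototypes, pseudo_labels, weights, temperature=0.1):
    """Weighted contrastive loss pulling each slot toward its class prototype.

    Assumption: an InfoNCE-style loss on cosine similarity, where the positive
    is the prototype of the slot's pseudo label and the other prototypes are
    negatives. `weights` plays the role of the per-slot weighting in the caption.
    """
    s = slots / np.linalg.norm(slots, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = (s @ p.T) / temperature                     # (num_slots, num_classes)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_slot = -log_prob[np.arange(len(slots)), pseudo_labels]
    return float((weights * per_slot).sum() / weights.sum())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    queries = rng.normal(size=(10, 16))     # 10 object queries, dimension 16
    slots = rng.normal(size=(4, 8))         # 4 slots, dimension 8
    proj = rng.normal(size=(8, 16))         # assumed slot-to-query projection
    print(build_slot_aware_queries(queries, slots, proj).shape)

    prototypes = rng.normal(size=(3, 8))    # 3 classes
    labels = np.array([0, 2, 2, 1])         # pseudo labels assigned to slots
    weights = np.array([0.9, 0.5, 0.7, 0.8])
    prototypes = update_prototypes(prototypes, slots, labels)
    print(slot_prototype_contrast(slots, prototypes, labels, weights))
```

In this sketch the loss is small when each slot lies close to the prototype of its pseudo label and far from the others, which is one simple way to express the caption's idea of guiding slots toward class-relevant features.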