Learn from the Learnt: Source-Free Active Domain Adaptation via Contrastive Sampling and Visual Persistence

Mengyao Lyu, Tianxiang Hao, Xinhao Xu, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding

TL;DR

Source-Free Active Domain Adaptation (SFADA) addresses domain adaptation when source data is unavailable and target labels are scarce. The authors propose Learn from the Learnt (LFTL), which alternates Contrastive Active Sampling (CAS) and Visual Persistence-guided Adaptation (VPA) to reuse the hypotheses and anchor features of earlier models without extra overhead. CAS prioritizes target samples that are informative to the current model and remain challenging across rounds, while VPA retains the features of actively selected anchors to guide alignment in the target domain. Experiments on VisDA-C, Office-Home, and Office-31 show state-of-the-art performance under low annotation budgets and continued improvement as the budget increases, with higher efficiency than competing source-free unsupervised DA (SFUDA) and active DA (ADA) methods. These results indicate a practical and scalable solution for real-world domain adaptation under data protection constraints.

Abstract

Domain Adaptation (DA) facilitates knowledge transfer from a source domain to a related target domain. This paper investigates a practical DA paradigm, namely Source data-Free Active Domain Adaptation (SFADA), where source data becomes inaccessible during adaptation and only a minimal annotation budget is available in the target domain. Without reference to the source data, new challenges emerge in identifying the most informative target samples for labeling, establishing cross-domain alignment during adaptation, and ensuring continuous performance improvements through the iterative query-and-adaptation process. In response, we present Learn from the Learnt (LFTL), a novel paradigm for SFADA that leverages the learnt knowledge from the source-pretrained model and actively iterated models without extra overhead. We propose Contrastive Active Sampling to learn from the hypotheses of the preceding model, thereby querying target samples that are both informative to the current model and persistently challenging throughout active learning. During adaptation, we learn from features of actively selected anchors obtained from previous intermediate models, so that the Visual Persistence-guided Adaptation can facilitate feature distribution alignment and active sample exploitation. Extensive experiments on three widely used benchmarks show that our LFTL achieves state-of-the-art performance, superior computational efficiency, and continuous improvements as the annotation budget increases. Our code is available at https://github.com/lyumengyao/lftl.
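
The query step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (the repository linked above is the reference for that): the contrast rule `(1 + alpha) * z_cur - alpha * z_prev`, the margin-based uncertainty, and the names `contrastive_margin_scores`, `select_queries`, and `alpha` are all assumptions chosen for illustration. The sketch only shows the general idea of contrasting the preceding model's predictions with the current model's before ranking unlabeled target samples.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def contrastive_margin_scores(z_cur, z_prev, alpha=1.0):
    """Higher score = more worth querying (hypothetical scoring rule).

    z_cur, z_prev: (num_samples, num_classes) logits from the current and
    the preceding model. ASSUMPTION: the contrast amplifies what changed
    between the two models; the paper's exact rule may differ.
    """
    z_cur = np.asarray(z_cur, dtype=float)
    z_prev = np.asarray(z_prev, dtype=float)
    contrast = (1.0 + alpha) * z_cur - alpha * z_prev
    q = softmax(contrast, axis=1)
    # Margin between the two most likely classes; a small margin means
    # the contrasted posterior is uncertain.
    top2 = np.sort(q, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]
    return -margin


def select_queries(z_cur, z_prev, budget, labeled_mask, alpha=1.0):
    """Return indices of up to `budget` unlabeled samples, best first."""
    labeled_mask = np.asarray(labeled_mask, dtype=bool)
    scores = contrastive_margin_scores(z_cur, z_prev, alpha)
    # Samples that already carry a label are never queried again.
    scores = np.where(labeled_mask, -np.inf, scores)
    budget = min(budget, int((~labeled_mask).sum()))
    return np.argsort(-scores, kind="stable")[:budget]
```

When both models produce identical logits, the contrast reduces to the current model's logits, so the ranking falls back to plain margin sampling; any disagreement between the two models is amplified by `alpha`.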

Paper Structure

This paper contains 22 sections, 9 equations, 7 figures, and 8 tables.

Figures (7)

  • Figure 1: (Left) The Source data-Free Active Domain Adaptation (SFADA) paradigm. Note that the ratio of labeled to unlabeled target samples is much lower ($\le 1\%$ or $\le 5\%$ in our experiments) than in the illustration. (Right) Performance comparison of state-of-the-art DA methods under different settings on Office-31.
  • Figure 2: The proposed LFTL framework for SFADA. Contrastive Active Sampling emphasizes freshly acquired knowledge in the posterior distribution so that novel samples are more likely to be queried for annotations. Then during adaptation the persistence vault retains previous domain-invariant knowledge to facilitate alignment in the target domain via $\mathcal{L}_{vpa}$ and $\mathcal{L}_{ent}$. As the iterative process continues, it yields more informative targets and an improved target model.
  • Figure 3: Results on the Office-31 dataset (ResNet50) in terms of classification accuracy (%). SF indicates source unavailability, and AS shows the percentage of active annotations (%). Best results are highlighted in bold, and the best in each section are underlined.
  • Figure 4: Comparison of complexity and actual query time with ADA methods. Notation is explained in the text.
  • Figure 5: t-SNE visualization of unlabeled target samples (colored by class) and actively queried samples (marked by crosses) on VisDA-C with a 0.1% budget per round.
  • ...and 2 more figures
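
The Figure 2 caption names a persistence vault and two loss terms, $\mathcal{L}_{vpa}$ and $\mathcal{L}_{ent}$, without defining them here. The sketch below is one plausible reading and not the paper's definition: it assumes the vault stores the feature each anchor had when it was first queried, that $\mathcal{L}_{vpa}$ is a mean cosine distance between current and stored anchor features, and that $\mathcal{L}_{ent}$ is the usual mean Shannon entropy of predictions on unlabeled targets. The names `PersistenceVault`, `vpa_loss`, and `entropy_loss` are hypothetical.

```python
import numpy as np


class PersistenceVault:
    """Hypothetical memory of anchor features, keyed by target sample index."""

    def __init__(self):
        self._features = {}

    def store(self, indices, features):
        # ASSUMPTION: keep the feature from the round in which the anchor
        # was first queried; later calls do not overwrite it.
        for i, f in zip(indices, features):
            self._features.setdefault(int(i), np.asarray(f, dtype=float).copy())

    def lookup(self, indices):
        return np.stack([self._features[int(i)] for i in indices])


def vpa_loss(current_features, stored_features, eps=1e-12):
    # ASSUMED form: mean cosine distance between the current features of the
    # anchors and the features remembered in the vault.
    a = current_features / (np.linalg.norm(current_features, axis=1, keepdims=True) + eps)
    b = stored_features / (np.linalg.norm(stored_features, axis=1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(a * b, axis=1)))


def entropy_loss(logits, eps=1e-12):
    # Mean Shannon entropy of the softmax predictions (standard form).
    z = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(z)
    p = p / p.sum(axis=1, keepdims=True)
    return float(np.mean(-np.sum(p * np.log(p + eps), axis=1)))
```

Under these assumptions, a training step would add `vpa_loss` on the labeled anchors to `entropy_loss` on the unlabeled batch; how the two terms are weighted is not stated in this excerpt.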