Deep Learning for Cross-Domain Few-Shot Visual Recognition: A Survey

Huali Xu; Shuaifeng Zhi; Shuzhou Sun; Vishal M. Patel; Li Liu

TL;DR

This survey formalizes cross-domain few-shot learning (CDFSL) as few-shot adaptation under domain shift and disjoint label spaces, and positions it within the broader transfer-learning landscape. It frames the core challenge through two-stage empirical risk minimization (TSERM) and presents a fourfold taxonomy ($\mathcal{D}$-Extension, $\mathcal{H}$-Constraint, $\Delta$-Adaptation, and Hybrid methods) for addressing the unreliability of empirical risk minimization across domains. The paper then details concrete techniques for each category, reviews datasets (e.g., Meta-Dataset, BSCD-FSL, FGCB), and analyzes performance across near and distant domain transfers, highlighting when each strategy excels. It also maps future directions (active learning, source-free CDFSL, and prompt-based adaptation) that could advance practical cross-domain few-shot vision. Overall, CDFSL offers a principled framework for using abundant but differently distributed source data to support learning in data-scarce, domain-shifted target scenarios.

Abstract

While deep learning excels in computer vision tasks with abundant labeled data, its performance diminishes significantly when labeled samples are limited. To address this, few-shot learning (FSL) enables models to perform target tasks with very few labeled examples by leveraging prior knowledge from related tasks. However, traditional FSL assumes that the related and target tasks come from the same domain, a restrictive assumption in many real-world scenarios where domain differences are common. To overcome this limitation, cross-domain few-shot learning (CDFSL) has gained attention, as it allows source and target data to come from different domains and label spaces. This paper presents the first comprehensive review of CDFSL, a field that has received less attention than traditional FSL due to its unique challenges. We aim to provide both a position paper and a tutorial for researchers, covering key problems, existing methods, and future research directions. The review begins with a formal definition of CDFSL and an outline of its core challenges, followed by a systematic analysis of current approaches organized under a clear taxonomy. Finally, we discuss promising future directions in terms of problem setups, applications, and theoretical advancements.

Paper Structure (40 sections, 7 equations, 12 figures, 5 tables)


Introduction
Background
Key Concepts
Problem Definition
Closely Related Problems
Unique Issue and Challenge
Empirical Risk Minimization (ERM)
Two-Stage Empirical Risk Minimization (TSERM)
Unique Issue and Challenge
Taxonomy
Approaches
$\mathcal{D}$-Extension
Data Augmentation
Feature Generation
Task Synthesis
...and 25 more sections

Figures (12)

Figure 1: The difference between few-shot learning and cross-domain few-shot learning.
Figure 2: Chronological milestones of CDFSL from 2019 to the present, including representative CDFSL approaches and related benchmarks. Key events include the release of Meta-Dataset and BSCD-FSL in 2020, the introduction of pioneering works, and subsequent contributions. Later works explored new setups, while others focused on improving performance. Please see the methods section for details.
Figure 3: (a) standard classification, (b) few-shot classification, (c) unsupervised domain adaptation, and (d) cross-domain few-shot classification. The different shapes represent different categories. $\mathcal{D}$ denotes a domain; $\mathcal{D}^{s}$ and $\mathcal{D}^{t}$ represent the source and target domains, respectively. Green and blue indicate the source and target data. Gray represents the unlabeled test data, and '?' indicates a prediction on the test data. Dotted arrows indicate the adaptation process.
Figure 4: Comparison of (a) vanilla supervised learning, (b) few-shot learning (FSL), and (c) cross-domain few-shot learning (CDFSL). The square represents the hypothesis space $\mathcal{H}$. Solid circles denote the datasets $\mathcal{D}$, with circle size indicating the amount of data; the large and small solid circles represent the auxiliary and limited target datasets, respectively. Dotted circles indicate the domain to which the target samples belong: the auxiliary dataset comes from the same domain as the target in (b), and from a different but related domain in (c). The angle between the optimization directions of the two stages represents the difference $\Delta$ between the source and target tasks; the larger the angle, the greater the difference.
Figure 5: The main taxonomy of cross-domain few-shot learning (CDFSL) methods: (a) $\mathcal{D}$-Extension, (b) $\mathcal{H}$-Constraint, and (c) $\Delta$-Adaptation.
...and 7 more figures

Theorems & Definitions (6)

Definition 2.1.1
Definition 2.1.2
Definition 2.2.1
Definition 2.2.2
Definition 2.2.3
Definition 2.2.4
