Free Lunch to Meet the Gap: Intermediate Domain Reconstruction for Cross-Domain Few-Shot Learning
Tong Zhang, Yifan Zhao, Liangyu Wang, Jia Li
TL;DR
This work tackles Cross-Domain Few-Shot Learning (CDFSL) by introducing Intermediate Domain Proxies (IDP), which reconstruct target-domain features from a source-derived codebook and enable fast, data-efficient domain alignment through normalization-layer transformations. The approach combines dense feature reconstruction with a clustered intermediate proxy pool and a BN-statistics-based alignment mechanism; it is optimized with three losses and organized into three stages (source pretraining, target finetuning, and target inference). The method reports state-of-the-art performance on eight cross-domain benchmarks and provides theoretical and empirical evidence that intermediate proxies reduce the semantic gap and improve target generalization. In practice, IDP offers a “free-lunch” style of adaptation that keeps additional data and computational cost low at deployment while delivering robust cross-domain transfer in few-shot settings.
Abstract
Cross-Domain Few-Shot Learning (CDFSL) aims to transfer generalized knowledge from a source domain to target domains using only a minimal amount of training data. It faces three learning challenges at once: disjoint semantics, large domain discrepancy, and data scarcity. Unlike predominant CDFSL works, which focus on generalized representations, we construct Intermediate Domain Proxies (IDP) that use source feature embeddings as a codebook and reconstruct target-domain features from this learned codebook. We then conduct an empirical study of the intrinsic attributes of intermediate domain proxies from the perspectives of visual style and semantic content. Building on these attributes, we develop a fast domain alignment method that uses the proxies as learning guidance for target-domain feature transformation. Through collaborative learning of intermediate domain reconstruction and target feature transformation, the proposed model surpasses state-of-the-art models by a margin on 8 cross-domain few-shot learning benchmarks.
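To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that reconstruction is a softmax-weighted combination of codebook entries under cosine similarity, and that alignment means matching per-channel mean and standard deviation to the reconstructed proxies. The function names, the `temperature` parameter, and the random toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def reconstruct_from_codebook(target_feats, codebook, temperature=0.1):
    """Rebuild each target feature as a convex combination of codebook rows.

    Assumption: weights come from a softmax over cosine similarities.
    The paper's actual reconstruction rule may differ.
    """
    t = target_feats / (np.linalg.norm(target_feats, axis=1, keepdims=True) + 1e-8)
    c = codebook / (np.linalg.norm(codebook, axis=1, keepdims=True) + 1e-8)
    logits = (t @ c.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ codebook, weights


def align_statistics(feats, ref_feats, eps=1e-5):
    """Shift and scale feats so per-channel mean/std match ref_feats.

    Assumption: a simple stand-in for a normalization-layer transformation.
    """
    mu, sigma = feats.mean(axis=0), feats.std(axis=0)
    ref_mu, ref_sigma = ref_feats.mean(axis=0), ref_feats.std(axis=0)
    return (feats - mu) / (sigma + eps) * ref_sigma + ref_mu


# Toy data (hypothetical): 16 source-derived codebook entries, 32 target features.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(16, 8))
target = rng.normal(loc=2.0, scale=3.0, size=(32, 8))

proxies, weights = reconstruct_from_codebook(target, codebook)
aligned = align_statistics(target, proxies)
```

Here `proxies` play the role of intermediate-domain features that lie in the span of the source codebook, and `aligned` holds the target features after their channel statistics are moved toward those proxies. A learned variant would replace the fixed statistics with trainable normalization parameters.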
