Large Margin Mechanism and Pseudo Query Set on Cross-Domain Few-Shot Learning

Jia-Fong Yeh, Hsin-Ying Lee, Bing-Chen Tsai, Yi-Rong Chen, Ping-Chia Huang, Winston H. Hsu

TL;DR

This work tackles cross-domain few-shot learning by introducing LMM-PQS, a fine-tuning framework that generates pseudo query images from support data and employs two margin-based losses to adapt pre-trained backbones to new domains with limited data. Central to the approach are the prototypical triplet (PT) loss, which enlarges inter-class margins using class prototypes, and the large margin cosine loss (LMCL), which enforces a margin between classes in the embedding space. The pseudo query set lets fine-tuning mimic the meta-testing scenario, and a cosine mean-centroid classifier is used for inference. Empirical results across four distinct domains show substantial improvements over baselines and indicate that the method is robust across backbones and shot settings, highlighting its practical potential for cross-domain adaptation with minimal labeled data.

Abstract

In recent years, few-shot learning problems have received a lot of attention. While the methods in most previous works were trained and tested on datasets from a single domain, cross-domain few-shot learning is a new branch of few-shot learning in which models handle datasets from different domains in the training and testing phases. In this paper, we address the setting where the model is pre-trained (meta-trained) on a single dataset and fine-tuned on datasets in four different domains, including common objects, satellite images, and medical images. We propose a novel large margin fine-tuning method (LMM-PQS), which generates pseudo query images from support images and fine-tunes the feature extraction modules with a large margin mechanism inspired by methods in face recognition. According to the experimental results, LMM-PQS surpasses the baseline models by a significant margin, demonstrating that our approach is robust and can easily adapt pre-trained models to new domains with little data.

Paper Structure

This paper contains 20 sections, 1 equation, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Illustration of the general and cross-domain few-shot learning problems. Few-shot learning is usually framed as a meta-learning scenario with a meta-training phase and a meta-testing phase. Models face a new task in every iteration. A task contains a support set (few labeled data) and a query set (unlabeled data). The goal of few-shot learning is to infer the query set labels with guidance from the support set. (a) In conventional few-shot learning, tasks in the meta-training phase are sampled from base classes and tasks in the meta-testing phase are sampled from novel classes. The two sets of classes are disjoint, but they might be similar to each other (e.g. two subsets of the same dataset). (b) In cross-domain few-shot learning, base classes and novel classes can come from entirely different domains. For example, base classes are sampled from common object images, and novel classes are sampled from satellite images. (cf. Section \ref{section_problem_definition})
  • Figure 2: Overview of the large margin mechanism and pseudo query set (LMM-PQS). The meta-testing phase consists of two stages, fine-tuning and inference. At the fine-tuning stage, LMM-PQS first generates the pseudo query set (PQS, cf. Section \ref{section_pqs}) and uses the trained backbone to compute the feature embeddings. After calculating the class prototypes, LMM-PQS applies the PT loss (cf. Section \ref{PT loss}) to enlarge the inter-class distance and decrease the intra-class distance. In addition, LMM-PQS applies LMCL (a large margin mechanism, cf. Section \ref{section_lmcl}) to increase the confidence of its predictions. At the inference stage, the process up to prototype calculation is the same as at the fine-tuning stage. A cosine mean-centroid classifier then compares the cosine similarity between the query sample and the class prototypes to assign a class. See Section \ref{methodology} and Figures \ref{fig_pqs}-\ref{fig_lmm} for a detailed description.
  • Figure 3: Pseudo query set (PQS). We use support samples (solid circles) to generate pseudo query samples (hollow circles) with several digital image processing operations (cf. Section \ref{section_pqs}). With PQS, the model can explore more of the feature space from only a few target samples. Most importantly, few-shot models with PQS can fine-tune their parameters in the same way as in the meta-training phase. PQS thus helps few-shot models adapt to novel tasks.
  • Figure 4: Illustration of PT loss. In PT loss, a sample $s_{i}$ (anchor) is pulled toward its class prototype $p_{c_{s_{i}}}$ (positive) and pushed away from the other class prototypes $p_{j}$ (negative). Compared to triplet loss \cite{Schroff15Triplet}, this yields a more comprehensive margin between classes, because each sample is compared against prototypes (cf. Section \ref{PT loss}).
  • Figure 5: Illustration of LMCL. The classifier needs to assign a (pseudo) query sample $q_{i}$ to class 1 or class 2. $p_{1}$ and $p_{2}$ denote the prototype vectors of class 1 and class 2. (a) The cosine similarity classifier only compares the magnitudes of the cosine values and labels $q_{i}$ as class 1, because $\cos\theta_{1}$ is larger than $\cos\theta_{2}$. When the two cosine values are close, there is no sufficient margin for a reliable decision. (b) LMCL aims to increase the confidence of the classification. LMCL subtracts a margin $m$ from $\cos\theta_{1}$ (the cosine for $q_{i}$'s class) and then compares the result with $\cos\theta_{2}$. If the prediction becomes class 2, LMCL knows that it does not have enough confidence in $q_{i}$. (c) After several fine-tuning iterations, the sample $q_{i}$ is forced closer to $p_{1}$ and becomes $q^{'}$, which has a larger cosine value. By applying LMCL, the region occupied by each class is squeezed, leaving a margin between the two classes. The margin benefits classifiers, as they can distinguish different classes more easily (cf. Section \ref{section_lmcl}).
  • ...and 2 more figures
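
The captions above describe class prototypes, a cosine mean-centroid classifier, the PT loss, and LMCL. The following is a minimal sketch of how these pieces could fit together on plain feature vectors, not the authors' implementation. All function names are hypothetical. The Euclidean distance and the sum over negative prototypes in `pt_loss`, the softmax form of `lmcl_loss` (borrowed from the usual large margin cosine loss in face recognition), and the values of `margin` and `scale` are assumptions, since this page does not give the paper's exact formulas or hyperparameters.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def class_prototypes(feats, labels):
    """Prototype of a class = mean of its support embeddings."""
    grouped = {}
    for f, y in zip(feats, labels):
        grouped.setdefault(y, []).append(f)
    return {y: [sum(col) / len(fs) for col in zip(*fs)]
            for y, fs in grouped.items()}


def classify(query, prototypes):
    """Cosine mean-centroid classifier: pick the most similar prototype."""
    return max(prototypes, key=lambda c: cosine(query, prototypes[c]))


def pt_loss(sample, label, prototypes, margin=0.5):
    """Triplet-style hinge with prototypes as positive and negatives.

    Assumption: Euclidean distance, summed over all other classes.
    """
    d_pos = math.dist(sample, prototypes[label])
    return sum(max(0.0, d_pos - math.dist(sample, p) + margin)
               for c, p in prototypes.items() if c != label)


def lmcl_loss(query, label, prototypes, scale=10.0, margin=0.35):
    """Cross-entropy over scaled cosines, with the margin subtracted
    from the cosine of the true class only (assumed softmax form)."""
    logits = {c: scale * (cosine(query, p) - (margin if c == label else 0.0))
              for c, p in prototypes.items()}
    top = max(logits.values())
    log_z = top + math.log(sum(math.exp(v - top) for v in logits.values()))
    return log_z - logits[label]


# Toy 2-way 2-shot task with 2-D embeddings (made-up numbers).
support = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
protos = class_prototypes(support, labels)
query = [0.8, 0.3]  # stands in for a (pseudo) query embedding

print("predicted class:", classify(query, protos))
print("PT loss:", pt_loss(query, 0, protos))
print("LMCL:", lmcl_loss(query, 0, protos))
```

In this sketch, setting `margin=0.0` in `lmcl_loss` reduces it to ordinary cross-entropy over scaled cosine similarities, so the margin only ever increases the loss for a given sample. That is the pressure which, as Figure 5 describes, pushes a query embedding closer to its own prototype during fine-tuning.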