Boosting Meta-Learning for Few-Shot Text Classification via Label-guided Distance Scaling

Yunlong Gao, Xinyue Liu, Yingbo Wang, Linlin Zong, Bo Xu

TL;DR

This work designs a label-guided loss that injects label semantic information by pulling sample representations closer to their corresponding label representations, and proposes a Label-guided Scaler that scales sample representations with label semantics to provide additional supervision signals.

Abstract

Few-shot text classification aims to recognize unseen classes with limited labeled text samples. Existing approaches focus on boosting meta-learners by developing complex algorithms in the training stage. However, the labeled samples are randomly selected during the testing stage, so they may not provide effective supervision signals, leading to misclassification. To address this issue, we propose a Label-guided Distance Scaling (LDS) strategy. The core of our method is exploiting label semantics as supervision signals in both the training and testing stages. Specifically, in the training stage, we design a label-guided loss to inject label semantic information, pulling sample representations closer to their corresponding label representations. In the testing stage, we propose a Label-guided Scaler, which scales sample representations with label semantics to provide additional supervision signals. Thus, even if labeled sample representations are far from class centers, our Label-guided Scaler pulls them closer to their class centers, thereby mitigating misclassification. We combine our method with two common meta-learners to verify its effectiveness. Extensive experimental results demonstrate that our approach significantly outperforms state-of-the-art models. All datasets and code are available at https://anonymous.4open.science/r/Label-guided-Text-Classification.
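The training-stage idea described in the abstract, pulling each sample representation toward its label representation, can be sketched as an extra distance term added to a standard prototypical-network loss. The sketch below is only an illustration under stated assumptions: the squared Euclidean pull term, the weight `lam`, the function names, and the toy 2-D vectors are all hypothetical and are not taken from the paper, whose exact loss is not given in this section.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def proto_cross_entropy(query, true_class, prototypes):
    """Prototypical-network loss: cross-entropy over softmax of negative squared distances."""
    logits = {c: -sq_dist(query, p) for c, p in prototypes.items()}
    m = max(logits.values())  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
    return log_z - logits[true_class]

def label_guided_loss(query, true_class, prototypes, label_embs, lam=0.5):
    """Assumed form: classification loss plus a term pulling the sample toward its label embedding."""
    ce = proto_cross_entropy(query, true_class, prototypes)
    pull = sq_dist(query, label_embs[true_class])  # hypothetical pull term
    return ce + lam * pull

# Toy 2-D "embeddings" (made up for illustration).
prototypes = {"sports": [1.0, 0.0], "politics": [0.0, 1.0]}
label_embs = {"sports": [1.0, 0.0], "politics": [0.0, 1.0]}

near = [0.9, 0.1]   # sample close to the "sports" label embedding
far = [0.5, 0.5]    # sample halfway between the two classes

loss_near = label_guided_loss(near, "sports", prototypes, label_embs)
loss_far = label_guided_loss(far, "sports", prototypes, label_embs)
print(loss_near < loss_far)
```

Under these assumptions, a sample that already sits near its label embedding incurs a smaller loss than one drifting toward another class, which is the kind of pressure the label-guided loss is described as applying.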

Paper Structure

This paper contains 35 sections, 11 equations, 4 figures, 10 tables, and 1 algorithm.

Figures (4)

  • Figure 1: Illustration of the problem in the testing stage, showing feature-space classification diagrams for Prototypical Networks and the proposed LDS-PN. Panel (a) shows that, because the support sample of class blue lies at the boundary of the class distribution, $Q_1$ is misclassified as class orange: it is closest to the support sample of class orange. Panel (b) shows that LDS-PN pulls support samples closer to the centers of their corresponding classes by leveraging their label semantics (names), so $Q_1$ is correctly classified as class blue.
  • Figure 2: The diagram illustrates the Prompting and Feature Encoding method, Distance Scaling in the training stage, the Label-guided Scaler in the testing stage, and classification via Prototypical Networks.
  • Figure 3: Visualization of text representations from the HuffPost dataset, given by (a) LDS w/o LS and (b) LDS (ours). The pentagrams represent the support samples.
  • Figure 4: Visualization of text representations sampled from five novel classes of the HuffPost dataset. The input representations are given by (a) Prototypical Networks, (b) ProtoVerb (a boosted version of PN using prompting and contrastive learning), and (c) LDS (ours). The pentagrams represent the class prototypes.
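The situation described in the Figure 1 caption can be reproduced with a toy one-shot example. The sketch below is an illustration, not the paper's Label-guided Scaler: the scaler is replaced by a simple interpolation toward the label embedding (`pull_toward_label`, with a made-up strength `alpha`), and all coordinates are invented so that the blue support sample sits at the edge of its class.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pull_toward_label(support, label_emb, alpha):
    """Hypothetical stand-in for the scaler: move a support vector a fraction alpha toward its label embedding."""
    return [(1 - alpha) * s + alpha * l for s, l in zip(support, label_emb)]

def nearest_class(query, prototypes):
    """One-shot prototypical classification: pick the class whose prototype is closest."""
    return min(prototypes, key=lambda name: euclidean(query, prototypes[name]))

# Invented 2-D geometry: label embeddings act as class centers.
label_embs = {"blue": [0.0, 0.0], "orange": [4.0, 0.0]}
# The blue support sample lies on the far edge of its class; the orange one leans toward blue.
supports = {"blue": [-1.5, 0.0], "orange": [3.0, 0.0]}
q1 = [1.0, 0.0]  # a query that truly belongs to class blue

# Panel (a): the raw support samples serve as prototypes.
plain = nearest_class(q1, supports)

# Panel (b): support samples are first pulled toward their label embeddings.
scaled = {c: pull_toward_label(supports[c], label_embs[c], 0.5) for c in supports}
guided = nearest_class(q1, scaled)

print(plain, guided)  # prints: orange blue
```

With the raw supports, $Q_1$ is 2.5 from the blue support and 2.0 from the orange one, so it is assigned to orange. After the pull, the distances become 1.75 and 2.5, and the query is assigned to blue.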