Label-template based Few-Shot Text Classification with Contrastive Learning

Guanghua Hou, Shuhui Cao, Deqiang Ouyang, Ning Wang

TL;DR

This work tackles few-shot text classification by addressing limitations of prototype-based meta-learning, notably its reliance on inter-class variance and sensitivity to noise. The authors propose a label-template based framework that injects label semantics into input sentences, coupled with supervised contrastive learning and an attention-based prototype update to better capture intra- and inter-class structure. A multi-task objective combines prototype-based classification with contrastive learning, guided by label semantics, yielding notable improvements across four benchmarks and faster convergence, especially in 1-shot scenarios. The approach demonstrates the practical impact of leveraging label information and attention mechanisms to enhance discriminative text representations in low-resource settings.

Abstract

As an algorithmic framework for learning to learn, meta-learning provides a promising solution for few-shot text classification. However, most existing research fails to give enough attention to class labels. The traditional framework, which builds the meta-learner on prototype networks, relies heavily on inter-class variance and is easily influenced by noise. To address these limitations, we propose a simple and effective few-shot text classification framework. In particular, the corresponding label templates are embedded into input sentences to fully utilize the potential value of class labels, guiding the pre-trained model to generate more discriminative text representations through the semantic information conveyed by labels. With the continuous influence of label semantics, supervised contrastive learning is utilized to model the interaction information between support samples and query samples. Furthermore, the averaging mechanism is replaced with an attention mechanism to highlight vital semantic information. To verify the proposed scheme, four typical datasets are employed to assess the performance of different methods. Experimental results demonstrate that our method achieves substantial performance enhancements and outperforms existing state-of-the-art models on few-shot text classification tasks.
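
The supervised contrastive component described in the abstract can be illustrated with a minimal NumPy sketch. It uses one common formulation of supervised contrastive loss, which is an assumption here because the abstract does not state the paper's exact loss. The function name `supervised_contrastive_loss`, the `temperature` default, and the idea of stacking support and query representations into a single batch are all hypothetical choices made for illustration.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """One common supervised contrastive loss (assumed form, not the paper's).

    embeddings: (n, d) array of support and query representations stacked together.
    labels:     (n,) array of class ids; same-class pairs are treated as positives.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)

    # Cosine similarity between every pair, scaled by the temperature.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    # A sample is never contrasted with itself.
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)

    # Row-wise log-softmax over all other samples (numerically stabilised).
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - sim_max - np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))

    # Positives: same label, excluding the anchor itself.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0  # no anchor has a positive partner

    # Average log-probability of the positives, per anchor, then over anchors.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_count[valid]).mean())
```

Under this assumed form, the loss is small when same-class representations lie close together and grows when the labels disagree with the geometry of the embeddings. That is the behaviour a contrastive term is meant to encourage alongside the prototype-based classification loss.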

Paper Structure

This paper contains 19 sections, 15 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: The structure of our method. Eight instances from two classes $a$ and $b$ are sampled to form $\mathcal{S}$ and $\mathcal{Q}$. ${v}^{{s,q}}_{ci}$ is the text representation. In the prototype network, a fully connected layer $g(\cdot)$ produces the attention-aware representation ${{v}'}$. The weight $\gamma_i$ is applied to the $i$-th representation to obtain the class prototype for classification. $L_{pn}$ and $L_{con}$ are used for multi-task learning.
  • Figure 2: The normalized loss and accuracy of different approaches with a learning rate of 1e-6 and an early stopping strategy. The results are averaged and sampled during testing on the HuffPost dataset.
  • Figure 3: The t-SNE visualization for ContrastNet (a), LaSAML (b), and our method (c). 100 samples from 5 classes are drawn in one testing episode (N=5, K=1).
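
The attention-based prototype described in the Figure 1 caption can be sketched as follows. The caption states only that a fully connected layer $g(\cdot)$ produces $v'$ and that a weight $\gamma_i$ is applied to the $i$-th representation. The `tanh` activation, the scoring vector `u`, the softmax normalisation, and the squared Euclidean distance used for classification are assumptions introduced for illustration, not details confirmed by the source.

```python
import numpy as np

def attention_prototype(support, W, b, u):
    """Attention-weighted class prototype (illustrative sketch).

    support: (K, d) support representations of one class.
    W, b:    parameters of the fully connected layer g(.), shapes (d, h) and (h,).
    u:       (h,) scoring vector; a hypothetical way to turn v' into a scalar score.
    """
    v_prime = np.tanh(support @ W + b)       # v' = g(v), assumed tanh activation
    scores = v_prime @ u                     # one scalar score per support sample
    scores = scores - scores.max()           # stabilise the softmax
    gamma = np.exp(scores) / np.exp(scores).sum()  # weights gamma_i, summing to 1
    prototype = gamma @ support              # weighted sum replaces the plain mean
    return prototype, gamma

def classify(query, prototypes):
    """Assign each query to the nearest prototype (squared Euclidean distance)."""
    diffs = query[:, None, :] - prototypes[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)        # shape (num_queries, num_classes)
    return dists.argmin(axis=1)
```

With all scores equal (for example `u` set to zeros), `gamma` is uniform and the prototype reduces to the plain average used by standard prototype networks. This makes the averaging mechanism a special case of the attention-weighted one.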