Label-template based Few-Shot Text Classification with Contrastive Learning
Guanghua Hou, Shuhui Cao, Deqiang Ouyang, Ning Wang
TL;DR
This work tackles few-shot text classification by addressing limitations of prototype-based meta-learning, notably its reliance on inter-class variance and sensitivity to noise. The authors propose a label-template based framework that injects label semantics into input sentences, coupled with supervised contrastive learning and an attention-based prototype update to better capture intra- and inter-class structure. A multi-task objective combines prototype-based classification with contrastive learning, guided by label semantics, yielding notable improvements across four benchmarks and faster convergence, especially in 1-shot scenarios. The approach demonstrates the practical impact of leveraging label information and attention mechanisms to enhance discriminative text representations in low-resource settings.
Abstract
As an algorithmic framework for learning to learn, meta-learning provides a promising solution for few-shot text classification. However, most existing research fails to give enough attention to class labels. The conventional framework, which builds the meta-learner on prototype networks, relies heavily on inter-class variance and is easily influenced by noise. To address these limitations, we propose a simple and effective few-shot text classification framework. In particular, the corresponding label templates are embedded into input sentences to fully utilize the potential value of class labels, guiding the pre-trained model to generate more discriminative text representations through the semantic information conveyed by labels. Under the continuous influence of label semantics, supervised contrastive learning is used to model the interaction between support samples and query samples. Furthermore, the averaging mechanism is replaced with an attention mechanism to highlight vital semantic information. To verify the proposed scheme, four typical datasets are employed to assess the performance of different methods. Experimental results demonstrate that our method achieves substantial performance enhancements and outperforms existing state-of-the-art models on few-shot text classification tasks.
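The three ingredients named in the abstract (label templates, an attention-weighted prototype in place of the mean, and a supervised contrastive loss) can be illustrated with a minimal sketch. It is not the authors' implementation: the template wording, the dot-product attention scoring against a query embedding, the temperature value, and all function names are assumptions made for illustration, and plain Python lists stand in for the pre-trained encoder's outputs.

```python
import math

def apply_label_template(sentence, label, template="This text is about {label}."):
    """Append a label template to the input (the template wording is hypothetical)."""
    return f"{sentence} {template.format(label=label)}"

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_prototype(support, query):
    """Class prototype as an attention-weighted sum of support embeddings.

    Assumption: weights are the softmax of dot products with the query,
    replacing the uniform weights of a plain mean.
    """
    weights = softmax([dot(s, query) for s in support])
    return [sum(w * s[i] for w, s in zip(weights, support)) for i in range(len(query))]

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """A common form of supervised contrastive loss over L2-normalised embeddings.

    Assumes every embedding is non-zero. Anchors without a same-label
    partner are skipped.
    """
    normed = [[x / math.sqrt(dot(e, e)) for x in e] for e in embeddings]
    total, anchors = 0.0, 0
    for i, zi in enumerate(normed):
        positives = [j for j in range(len(normed)) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        logits = {j: dot(zi, normed[j]) / temperature for j in range(len(normed)) if j != i}
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

In a multi-task setup of the kind the TL;DR describes, the contrastive term would be added to the prototype-based classification loss; the weighting between the two is not given in this section and is left out of the sketch.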
