Label Distribution Learning-Enhanced Dual-KNN for Text Classification

Bo Yuan, Yulin Chen, Zhen Tan, Wang Jinyan, Huan Liu, Yin Zhang

TL;DR

This work tackles text classification by exploiting internal model information through a dual $k$NN (D$k$NN) framework that retrieves neighbors using both text embeddings and predicted label distributions. A label distribution learning (LL) module learns label similarity and uses contrastive learning to produce more discriminative label representations, improving both the base model and the quality of retrieved neighbors. Empirical results across five datasets show consistent accuracy gains and enhanced robustness to noisy labels, outperforming various baselines and ablations. The approach advances retrieval-augmented NLP by leveraging intermediate representations and label correlations to enable more robust and reliable predictions.

Abstract

Many text classification methods introduce external information (e.g., label descriptions and knowledge bases) to improve classification performance. In contrast, internal information generated by the model itself during training, such as text embeddings and predicted label probability distributions, is poorly exploited at prediction time. In this paper, we focus on leveraging this internal information and propose a dual $k$ nearest neighbor (D$k$NN) framework with two $k$NN modules that retrieve several neighbors from the training set and augment the label distribution. A $k$NN module is easily confused, however, and may cause incorrect predictions when it retrieves nearest neighbors from noisy datasets (datasets with labeling errors) or similar datasets (datasets with similar labels). To address this issue, we also introduce a label distribution learning module that learns label similarity and generates a better label distribution, helping models distinguish texts more effectively. This module eases model overfitting and improves final classification performance, which in turn enhances the quality of the neighbors retrieved by the $k$NN modules during inference. Extensive experiments on benchmark datasets verify the effectiveness of our method.

Paper Structure

This paper contains 30 sections, 10 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Dots of different colors denote different classes. (a) Previous $k$NN-based work may retrieve neighbors (the dots in the black circle) that belong to other classes (green dots) when the target text (deep blue dot) comes from a similar dataset and is hard to distinguish (classification boundaries are close). (b) Our proposed label distribution learning improves the model (clusters of different classes become tighter and move away from the classification boundaries), so the quality of the neighbors retrieved by $k$NN is enhanced (the dots in the black circle are all blue).
  • Figure 2: The overall framework of our proposed method. The representation store contains a set of representation-label pairs, which are extracted from the hidden states of the label distribution learning-enhanced model. During inference, the $k$ nearest neighbors are queried from the representation store according to similarity distance, and the distances are converted into a $k$NN prediction distribution. Interpolating the $k$NN distribution with the vanilla model prediction distribution gives the final distribution.
  • Figure 3: Hyperparameter settings of the D$k$NN.
  • Figure 4: t-SNE visualization of the representations cached in the representation store.
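
The inference step described in the Figure 2 caption (query the representation store, convert distances into a $k$NN distribution, interpolate with the model distribution) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the squared Euclidean distance, the softmax over negative distances, the `temperature` parameter, and the interpolation weight `lam` are all assumptions chosen for illustration. In a dual setting, the same routine could presumably be applied once per key type (text embeddings and predicted label distributions).

```python
import math

def knn_distribution(query, store, num_classes, k=3, temperature=1.0):
    """Turn the k nearest stored representations into a label distribution.

    `store` is a list of (representation, label) pairs; the distance metric
    and temperature-scaled softmax are illustrative assumptions.
    """
    # Squared Euclidean distance from the query to every stored representation.
    scored = [
        (sum((q - r) ** 2 for q, r in zip(query, rep)), label)
        for rep, label in store
    ]
    neighbors = sorted(scored)[:k]
    # Softmax over negative distances (shifted by the smallest distance for
    # numerical stability): closer neighbors receive larger weights.
    d_min = neighbors[0][0]
    weights = [math.exp(-(dist - d_min) / temperature) for dist, _ in neighbors]
    total = sum(weights)
    probs = [0.0] * num_classes
    for weight, (_, label) in zip(weights, neighbors):
        probs[label] += weight / total
    return probs

def interpolate(model_probs, knn_probs, lam=0.5):
    """Mix the model distribution with the kNN distribution (weight lam on kNN)."""
    return [lam * kp + (1.0 - lam) * mp for mp, kp in zip(model_probs, knn_probs)]

# Toy representation store: two classes with two cached representations each.
store = [
    ([0.0, 0.0], 0),
    ([0.1, 0.0], 0),
    ([1.0, 1.0], 1),
    ([0.9, 1.1], 1),
]
query = [0.05, 0.0]          # hypothetical representation of the target text
model_probs = [0.4, 0.6]     # hypothetical model prediction favoring class 1

knn_probs = knn_distribution(query, store, num_classes=2, k=3)
final_probs = interpolate(model_probs, knn_probs, lam=0.5)
print(knn_probs, final_probs)
```

In this toy case the model alone favors class 1, while the retrieved neighbors mostly belong to class 0, so the interpolated distribution shifts toward class 0. This also shows why neighbor quality matters: if the stored representations near the query carried wrong or confusable labels, the same interpolation would pull the prediction in the wrong direction.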