Like a Good Nearest Neighbor: Practical Content Moderation and Text Classification

Luke Bates, Iryna Gurevych

TL;DR

LaGoNN presents a practical, parameter-free augmentation to SetFit that decorates input text with information drawn from nearest neighbors in the training set, leveraging the distance-aware nature of Sentence Transformers to improve few-shot text classification for content moderation. By appending neighbor labels, distances, and texts to input instances (via configurations such as LabDist), and optionally fine-tuning the embedding encoder with a subset of data, LaGoNN delivers robust, often superior results across imbalanced, balanced, and multilingual settings while reducing computational cost relative to full-model fine-tuning. Across 17 datasets, LaGoNN demonstrates strong performance and stability, particularly when label distributions are skewed or domain drift is pronounced; LabDist emerges as the default, reliable configuration. The approach extends to general text classification, showing broad applicability and offering a practical path toward scalable, inexpensive content moderation and moderation-aware NLP systems.

Abstract

Few-shot text classification systems have impressive capabilities but are infeasible to deploy and use reliably due to their dependence on prompting and billion-parameter language models. SetFit (Tunstall et al., 2022) is a recent, practical approach that fine-tunes a Sentence Transformer under a contrastive learning paradigm and achieves similar results to more unwieldy systems. Inexpensive text classification is important for addressing the problem of domain drift in all classification tasks, and especially in detecting harmful content, which plagues social media platforms. Here, we propose Like a Good Nearest Neighbor (LaGoNN), a modification to SetFit that introduces no learnable parameters but alters input text with information from its nearest neighbor in the training data, for example, the label and text, making novel data appear similar to an instance on which the model was optimized. LaGoNN is effective at flagging undesirable content and at text classification, and improves the performance of SetFit. To demonstrate the value of LaGoNN, we conduct a thorough study of text classification systems in the context of content moderation under four label distributions, and in general and multilingual classification settings.
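
The input modification described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `toy_embed` is a hypothetical bag-of-words stand-in for a Sentence Transformer, cosine distance is assumed as the distance measure, and the exact string format of the appended `[label distance]` suffix is an assumption.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a Sentence Transformer: word counts."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Cosine distance between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def nearest_neighbor(query, train_texts, skip_index=None):
    """Return (index, distance) of the closest training text."""
    query_vec = toy_embed(query)
    best_index, best_distance = None, float("inf")
    for i, text in enumerate(train_texts):
        if i == skip_index:  # lets a training instance ignore itself
            continue
        d = cosine_distance(query_vec, toy_embed(text))
        if d < best_distance:
            best_index, best_distance = i, d
    return best_index, best_distance


def decorate_labdist(text, train_texts, train_labels, skip_index=None):
    """Append the neighbor's gold label and distance (assumed format)."""
    i, d = nearest_neighbor(text, train_texts, skip_index)
    return f"{text} [{train_labels[i]} {d:.2f}]"


train_texts = ["you are an idiot", "have a nice day"]
train_labels = ["toxic", "neutral"]
print(decorate_labdist("what an idiot", train_texts, train_labels))
```

The decorated string, rather than the raw text, is what would then be embedded and passed to the classifier. No parameters are learned in this step, which matches the abstract's claim that the modification itself is parameter-free.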

Paper Structure (44 sections, 69 figures, 42 tables)

This paper contains 44 sections, 69 figures, 42 tables.

Figures (69)

  • Figure 1: LaGoNN LabDist uses an ST to encode training data, performs NN lookup, appends the NN's gold label and distance, and optionally uses SetFit to fine-tune the embedding model. We then embed this new instance and train a classifier. During inference, we use the embedding model to modify the test data with its NN's gold label and distance from the training data, compute the final representation, and call the classifier. Input text is in quotation marks, the NN's gold label and distance are in brackets, and the integer label is in parentheses.
  • Figure 2: First row: performance for all LaGoNN configurations and balance regimes for the Hate Speech Offensive dataset. Second row: LaGoNN performance for one to five neighbors for all balance regimes on a collapsed version of the LIAR dataset. We use the LaGoNN$_{lite}$ fine-tuning strategy (see Section \ref{setfit_many}).
  • Figure 3: Average performance for all sampling regimes on Toxic Conversations. More expensive models, such as LaGoNN$_{exp}$, SetFit$_{exp}$, and RoBERTa$_{full}$, perform best when the label distribution is imbalanced. As the distribution becomes more balanced, inexpensive models, such as LaGoNN$_{lite}$, show similar or improved performance. The measure is average precision and we only consider one neighbor for the LaGoNN-based methods (see Appendix \ref{sec:appendixadditional2} for additional results).
  • Figure 4: Average performance on four datasets in the balanced sampling regime; the measure is average precision for IMDB, macro-F1 elsewhere. First row: SetFit$_{lite}$ compared to LaGoNN$_{exp}$ LabDist with modifications for one to five neighbors. Second row: SetFit$_{exp}$ compared to LaGoNN$_{exp}$. See Appendix \ref{sec:generalappendix} for additional results.
  • Figure 5: We embed training data, retrieve the text, gold label, and distance for each instance from its nearest neighbor and modify the original text with this information. Then we embed the modified training data and train a classifier. During inference, the NN from the training data is selected, the original text is modified with the text, gold label, and distance from this NN, and the classifier is called.
  • ...and 64 more figures
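
The training and inference flow in the Figure 5 caption (modify with the neighbor's text, gold label, and distance, then embed and classify) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `embed` is a hypothetical bag-of-words stand-in for the Sentence Transformer, the suffix format is invented for readability, a training instance is assumed to skip itself during neighbor lookup, and a 1-nearest-neighbor rule over the modified embeddings stands in for the trained classifier.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for a Sentence Transformer encoder."""
    return Counter(text.lower().split())


def distance(a, b):
    """Cosine distance between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(
        sum(v * v for v in a.values()) * sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def modify(text, train, skip=None):
    """Append the neighbor's gold label, distance, and text (assumed format)."""
    query = embed(text)
    candidates = [
        (distance(query, embed(t)), t, y)
        for i, (t, y) in enumerate(train)
        if i != skip
    ]
    d, nn_text, nn_label = min(candidates, key=lambda c: c[0])
    return f"{text} [{nn_label} {d:.2f} {nn_text}]"


def fit(train):
    """Training: modify each instance, then embed the modified text.

    Assumption: an instance ignores itself when looking up its neighbor.
    """
    modified = [modify(t, train, skip=i) for i, (t, _) in enumerate(train)]
    return [(embed(m), y) for m, (_, y) in zip(modified, train)]


def predict(text, train, model):
    """Inference: modify with the training-set neighbor, embed, classify.

    The 1-NN rule here is a stand-in for the trained classifier.
    """
    query = embed(modify(text, train))
    return min(model, key=lambda example: distance(query, example[0]))[1]


train = [
    ("you are an idiot", "toxic"),
    ("what an idiot you are", "toxic"),
    ("have a nice day", "neutral"),
    ("what a nice day", "neutral"),
]
model = fit(train)
print(modify("such an idiot", train))
print(predict("such an idiot", train, model))
```

Because the same `modify` step runs at training and inference time, a test instance is presented to the classifier in the same decorated form as the instances it was trained on.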