When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes

Asaf Yehudai, Elron Bendel

TL;DR

FastFit tackles few-shot text classification with many semantically similar classes by learning a shared embedding space via batch contrastive training and token-level similarity. It is distributed as a pip-installable library that hooks into the Hugging Face trainer and supports CLS or token-level similarity metrics, data augmentation, and class-name augmentation. Empirical results on the FewMany benchmark and the multilingual MASSIVE dataset show FastFit achieving state-of-the-art or near-state-of-the-art performance with substantial training speedups (3–20×) compared to SetFit and LLM prompting baselines, and strong performance when trained on full data. The result is a fast, scalable, language-agnostic classifier that outperforms LLMs and other fine-tuning methods in many-class few-shot settings.

Abstract

We present FastFit, a method and a Python package designed to provide fast and accurate few-shot classification, especially for scenarios with many semantically similar classes. FastFit utilizes a novel approach integrating batch contrastive learning and a token-level similarity score. Compared to existing few-shot learning packages, such as SetFit, Transformers, or few-shot prompting of large language models via API calls, FastFit significantly improves multiclass classification performance in speed and accuracy across FewMany, our newly curated English benchmark, and multilingual datasets. FastFit demonstrates a 3-20x improvement in training speed, completing training in just a few seconds. The FastFit package is now available on GitHub and PyPI, presenting a user-friendly solution for NLP practitioners.
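
The abstract names two ingredients, batch contrastive learning and a token-level similarity score, without giving their formulas. As a rough illustration only, the sketch below assumes a late-interaction token score (each token of one text is matched to its most similar token in the other, and the matches are averaged) and a standard supervised contrastive loss over the batch. The function names, the temperature value, and the toy 2-D "token embeddings" are all hypothetical and are not taken from the FastFit package; the method's exact scoring and loss may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def token_level_similarity(query_tokens, doc_tokens):
    """Assumed late-interaction score: for every query token take its
    best-matching document token, then average over query tokens."""
    best = [max(cosine(q, d) for d in doc_tokens) for q in query_tokens]
    return sum(best) / len(best)

def batch_contrastive_loss(texts, labels, temperature=0.1):
    """Assumed supervised contrastive loss over one batch.

    Each text is an anchor; texts with the same label are positives and
    all other texts in the batch act as negatives.
    """
    total, counted = 0.0, 0
    for i in range(len(texts)):
        others = [j for j in range(len(texts)) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        logits = {
            j: token_level_similarity(texts[i], texts[j]) / temperature
            for j in others
        }
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        # Average negative log-probability of picking each positive.
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0

# Toy batch: four "texts", each a list of 2-D token embeddings (made up).
batch = [
    [[1.0, 0.0], [0.9, 0.1]],    # points mostly along the x-axis
    [[0.95, 0.05], [1.0, 0.0]],  # points mostly along the x-axis
    [[0.0, 1.0], [0.1, 0.9]],    # points mostly along the y-axis
    [[0.05, 0.95], [0.0, 1.0]],  # points mostly along the y-axis
]
loss_matching = batch_contrastive_loss(batch, [0, 0, 1, 1])
loss_mismatched = batch_contrastive_loss(batch, [0, 1, 0, 1])
print(loss_matching, loss_mismatched)
```

When the labels agree with the geometry of the embeddings (the first labelling), the loss is lower than when same-label texts are far apart (the second labelling); training would push an encoder toward the first situation.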

Paper Structure (23 sections, 2 equations, 5 figures, 14 tables)

This paper contains 23 sections, 2 equations, 5 figures, 14 tables.

Figures (5)

  • Figure 1: FastFit achieves SOTA classification results combined with fast training and high throughput, outperforming other fine-tuning methods and strong LLMs.
  • Figure 2: Training times (sec) for FastFit, SetFit, and standard classifier with MPNet model. FastFit training is 3-20x faster.
  • Figure 3: Average 5-shot Accuracy on the FewMany benchmark of various FastFit models over training time, measured in seconds, trained on an Nvidia A100-80GB GPU.
  • Figure 4: Average 5- and 10-shot accuracy on the FewMany benchmark of various FastFit models over training time, measured in seconds, trained on an Nvidia A100-80GB GPU.
  • Figure 5: Training times (sec) for FastFit, SetFit, and standard classifier. FastFit training is faster both for the small model (top) and for the large model (bottom).