CrossTune: Black-Box Few-Shot Classification with Label Enhancement

Danqing Luo, Chen Zhang, Yan Zhang, Haizhou Li

TL;DR

CrossTune is a label-enhanced cross-attention framework for black-box, few-shot text classification that avoids costly prompt search. It converts labels into descriptive texts and aligns input representations with these label descriptions via multi-head cross-attention, using a frozen LLM as a feature extractor. Training data is augmented with ChatGPT-generated samples, and a switch mechanism with a DeBERTa teacher filters out low-quality augmentations. On seven benchmark datasets, CrossTune outperforms the previous state-of-the-art gradient-free black-box tuning method by 5.7% on average, and it performs better than or comparably to previous black-box tuning methods even without augmented data. The work advances practical Language-Model-as-a-Service (LMaaS) adaptation by combining label semantics, data augmentation, and quality control to improve generalization in low-resource settings.

Abstract

Training or finetuning large-scale language models (LLMs) requires substantial computational resources, motivating recent efforts to explore parameter-efficient adaptation to downstream tasks. One approach is to treat these models as black boxes and interact with them only through forward passes (inference APIs). Current research focuses on adapting these black-box models to downstream tasks using gradient-free prompt optimization, but this often involves an expensive search for task-specific prompts. Therefore, we are motivated to study black-box language model adaptation without prompt search. Specifically, we introduce a label-enhanced cross-attention network called CrossTune, which models the semantic relatedness between the input text sequence and task-specific label descriptions. Its effectiveness is examined in the context of few-shot text classification. To improve the generalization of CrossTune, we utilize ChatGPT to generate additional training data through in-context learning. A switch mechanism is implemented to exclude low-quality ChatGPT-generated data. Through extensive experiments on seven benchmark text classification datasets, we demonstrate that our proposed approach outperforms the previous state-of-the-art gradient-free black-box tuning method by 5.7% on average. Even without using ChatGPT-augmented data, CrossTune performs better than or comparably to previous black-box tuning methods, suggesting the effectiveness of our approach.
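
The label-enhanced cross-attention idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that label-description embeddings act as queries over token features returned by the frozen model, and that a linear scoring vector turns each attended label vector into a logit. All names (`multi_head_cross_attention`, `classify`, `params`, `w_score`), the dimensions, and the random inputs are hypothetical placeholders.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_cross_attention(queries, keys_values, params, num_heads):
    """queries: (L, d) label-description embeddings (assumed role).
    keys_values: (T, d) token features from the frozen black-box model."""
    num_labels, d = queries.shape
    assert d % num_heads == 0, "feature size must be divisible by num_heads"
    head_dim = d // num_heads

    def split_heads(x):
        # (N, d) -> (H, N, head_dim)
        return x.reshape(x.shape[0], num_heads, head_dim).transpose(1, 0, 2)

    q = split_heads(queries @ params["Wq"])
    k = split_heads(keys_values @ params["Wk"])
    v = split_heads(keys_values @ params["Wv"])

    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (H, L, T)
    weights = softmax(scores, axis=-1)                     # attention over tokens
    context = weights @ v                                  # (H, L, head_dim)
    merged = context.transpose(1, 0, 2).reshape(num_labels, d)
    return merged @ params["Wo"], weights


def classify(token_features, label_embeddings, params, num_heads=4):
    # One logit per label: score each label's attended representation.
    attended, weights = multi_head_cross_attention(
        label_embeddings, token_features, params, num_heads
    )
    logits = attended @ params["w_score"]  # (L,)
    return softmax(logits), weights


# Toy inputs standing in for real features (hypothetical sizes).
rng = np.random.default_rng(0)
d, num_tokens, num_labels = 16, 10, 3
params = {
    name: rng.normal(scale=d ** -0.5, size=(d, d))
    for name in ("Wq", "Wk", "Wv", "Wo")
}
params["w_score"] = rng.normal(size=d)
token_features = rng.normal(size=(num_tokens, d))
label_embeddings = rng.normal(size=(num_labels, d))

probs, attn = classify(token_features, label_embeddings, params)
print(probs.shape, attn.shape)
```

In this sketch only the small attention module would be trained; the features are treated as fixed inputs, which mirrors the black-box setting in which no gradients flow through the LLM.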

Paper Structure (26 sections, 4 equations, 5 figures, 9 tables)

Figures (5)

  • Figure 1: Input template examples. The blue boxes contain the labels for the corresponding classification tasks.
  • Figure 2: System Overview of CrossTune.
  • Figure 3: Prompt-based finetuning of $\mathcal{A}_{deberta}$. The underlined text is the prompt template. In the bottom box, the first, second, and third lines are the input text sequence, the demonstration for label:negative, and the demonstration for label:positive respectively. The verbalizer maps the labels to the corresponding words.
  • Figure 4: t-SNE plots of embeddings of the original training data, the test data, and the ChatGPT-augmented training data. Note that we randomly sample the same amount of in-distribution training data from the original training set as there is ChatGPT-augmented data.
  • Figure 5: The performance of ChatGPT vs DeBERTa on the development set, which helps determine when to filter the ChatGPT-augmented data. A positive correlation can be observed between the performance of the teacher model on the development set and that of CrossTune on the test set across most datasets.
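
The switch decision described in the Figure 5 caption might be sketched as follows. The exact rule is not given in this summary, so the logic here is an assumption: filtering is switched on only when the teacher beats the generator on the development set, and a filtered sample is kept only if the teacher agrees with the label it was generated for. `Example`, `accuracy`, `filter_augmented`, and the toy predictors are hypothetical names, not the paper's code.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Example:
    text: str
    label: str  # gold label (dev set) or intended label (augmented sample)


def accuracy(predict: Callable[[str], str], dev_set: List[Example]) -> float:
    if not dev_set:
        raise ValueError("development set must not be empty")
    return sum(predict(ex.text) == ex.label for ex in dev_set) / len(dev_set)


def filter_augmented(
    augmented: List[Example],
    dev_set: List[Example],
    teacher_predict: Callable[[str], str],
    generator_predict: Callable[[str], str],
) -> Tuple[List[Example], bool]:
    """Return (kept samples, whether the filter was switched on)."""
    teacher_acc = accuracy(teacher_predict, dev_set)
    generator_acc = accuracy(generator_predict, dev_set)
    # Assumed rule: trust the teacher as a filter only if it is the
    # stronger model on the development set; otherwise keep everything.
    if teacher_acc <= generator_acc:
        return list(augmented), False
    kept = [ex for ex in augmented if teacher_predict(ex.text) == ex.label]
    return kept, True


# Toy stand-ins for the DeBERTa teacher and the ChatGPT classifier.
def toy_teacher(text: str) -> str:
    return "positive" if "good" in text else "negative"


def toy_generator(text: str) -> str:
    return "positive"


dev = [Example("good movie", "positive"), Example("bad movie", "negative")]
augmented = [
    Example("a good film", "positive"),
    Example("a dull film", "positive"),  # teacher disagrees, so it is dropped
]
kept, switched = filter_augmented(augmented, dev, toy_teacher, toy_generator)
print(switched, [ex.text for ex in kept])
```

Comparing both models on the same development set keeps the decision per dataset, which matches the caption's point that the comparison determines when filtering is applied.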