CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification

Yang Li, Canran Xu, Guodong Long, Tao Shen, Chongyang Tao, Jing Jiang

TL;DR

This paper tackles verbalizer ambiguity that arises when prefix-tuning is applied to many-class classification. It introduces CCPrefix, a method that builds instance-dependent soft prefixes from fact-counterfactual label pairs, aligns these prefixes with global prototypes, and employs a Siamese training objective to stabilize learning. Across relation classification, topic classification, and entity typing, CCPrefix achieves state-of-the-art or strong gains in both fully supervised and few-shot settings, outperforming PTR, ProtoVerb, and PETAL baselines. The work reduces reliance on manually crafted prompts and label words, improving robustness to large label spaces and extending applicability to evolving language models while offering practical benefits for real-world NLP tasks.

Abstract

Recently, prefix-tuning was proposed to efficiently adapt pre-trained language models to a broad spectrum of natural language classification tasks. It leverages soft prefixes as task-specific indicators and language verbalizers as categorical-label mentions to narrow the formulation gap from language-model pre-training. However, when the label space grows considerably (i.e., many-class classification), this tuning technique suffers from a verbalizer ambiguity problem, since the many labels are represented by semantically similar verbalizers in short language phrases. To overcome this, inspired by the human decision process in which the most ambiguous classes are mulled over for each instance, we propose a new prefix-tuning method, Counterfactual Contrastive Prefix-tuning (CCPrefix), for many-class classification. Specifically, an instance-dependent soft prefix, derived from fact-counterfactual pairs in the label space, is leveraged to complement the language verbalizers in many-class classification. We conduct experiments on many-class benchmark datasets in both the fully supervised and few-shot settings; the results indicate that our model outperforms previous baselines.

Paper Structure

This paper contains 27 sections, 10 equations, 4 figures, 5 tables, and 1 algorithm.

Figures (4)

  • Figure 1: An illustrative example of the entity typing task from the FewNERD dataset (Ding et al., 2021). Option A is the ground-truth label, and Option B is the counterfactual. Red words are the attributes relevant to the question.
  • Figure 2: Our proposed model, CCPrefix. For ease of comprehension, we zoom out contrastive prefix construction and contrastive attributes generation in \ref{sec:CPfC}. The losses ${\mathcal{L}}_{\rm cls}$, ${\mathcal{L}}_{\rm s}$ and ${\mathcal{L}}_{\rm con}$ are defined in \ref{eq:ins_loss}, \ref{eq:sym_loss} and \ref{eq:con_loss}, respectively. The black line is the forward path for both training and inference, while the green line is the training path with the supervised signal.
  • Figure 3: An illustration of the selection of the top-2 contrastive attributes ${\bm{c}}_{i,j}$, based on the similarities between all possible ${\bm{c}}_{i,j}$ and their corresponding prototypes ${\bm{p}}_{i,j}$, where the $i$-th class is the fact and the $j$-th class is its counterfactual.
  • Figure 4: The highlighted tokens of the same sentence, where the two entities are underscored. On the left, the tokens are projected onto the ground truth $y^*$ = per:city_of_birth, and on the right onto the contrastive space between $y^*$ and the counterfactual $y'$ = per:city_of_death.
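
The selection step described in the Figure 3 caption can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the excerpt does not say which similarity measure is used, so cosine similarity is assumed here, and the function names (`cosine`, `select_top_k`), the dictionary layout keyed by (fact, counterfactual) index pairs, and the toy vectors are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_top_k(attributes, prototypes, k=2):
    """Return the k (fact i, counterfactual j) pairs whose contrastive
    attribute c[i, j] is most similar to its own prototype p[i, j],
    ordered from most to least similar."""
    scored = [
        (cosine(attributes[pair], prototypes[pair]), pair)
        for pair in attributes
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:k]]

# Toy data for 3 classes: one 2-d vector per ordered pair (i, j) with i != j.
prototypes = {
    (0, 1): [1.0, 0.0], (0, 2): [0.0, 1.0],
    (1, 0): [1.0, 1.0], (1, 2): [1.0, -1.0],
    (2, 0): [-1.0, 0.0], (2, 1): [0.0, -1.0],
}
attributes = {
    (0, 1): [1.0, 0.1], (0, 2): [1.0, 0.0],
    (1, 0): [1.0, 1.0], (1, 2): [-1.0, 1.0],
    (2, 0): [0.0, 1.0], (2, 1): [1.0, -1.0],
}

top2 = select_top_k(attributes, prototypes, k=2)
print(top2)  # [(1, 0), (0, 1)]
```

In this toy setup, the pair (1, 0) is selected first because its attribute points in the same direction as its prototype, followed by (0, 1), which is nearly aligned. A real model would compute the attributes and prototypes from learned representations rather than fixed vectors.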