Leveraging Biases in Large Language Models: "bias-kNN" for Effective Few-Shot Learning

Yong Zhang; Hanzhang Li; Zhitao Li; Ning Cheng; Ming Li; Jing Xiao; Jianzong Wang

TL;DR

This work addresses biases inherent in large language models that hinder zero-shot and few-shot learning. It introduces bias-kNN, a method that converts biased LM outputs into kNN features and augments them with gold labels, enabling robust few-shot text classification across diverse domains and GPT-2 sizes. Across six datasets and multiple model variants, bias-kNN achieves competitive or superior accuracy compared with the Zero-LM and Raw-ICL baselines, and remains stable across template and verbalizer choices. The findings suggest that strategically leveraging biases, rather than correcting them, can yield practical advantages for retrieval-based classification tasks in low-resource settings.

Abstract

Large Language Models (LLMs) have shown significant promise in various applications, including zero-shot and few-shot learning. However, their performance can be hampered by inherent biases. Rather than minimizing or correcting these biases, as existing methods typically aim to do, this study introduces a novel methodology named "bias-kNN". This approach capitalizes on the biased outputs, using them as the primary features for kNN and supplementing them with gold labels. Our comprehensive evaluations, spanning text classification datasets from diverse domains and different GPT-2 model sizes, indicate the adaptability and efficacy of the "bias-kNN" method. Remarkably, this approach not only outperforms conventional in-context learning in few-shot scenarios but also demonstrates robustness across a range of samples, templates and verbalizers. This study therefore presents a unique perspective on harnessing biases, transforming them into assets for enhanced model performance.
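The idea of treating biased outputs as kNN features can be illustrated with a minimal sketch. It is not the authors' implementation: the logit values are invented for illustration, and the Euclidean distance, majority vote, and `knn_predict` helper are assumptions chosen for simplicity. Each sample is represented by its zero-shot logits over the label words, and a handful of gold-labeled samples serve as the neighbors.

```python
import numpy as np

def knn_predict(train_feats, train_labels, query_feats, k=3):
    """Majority-vote kNN over label-word logit features (Euclidean distance assumed)."""
    train_feats = np.asarray(train_feats, dtype=float)
    train_labels = np.asarray(train_labels)
    query_feats = np.asarray(query_feats, dtype=float)
    # Pairwise distances, shape (n_query, n_train).
    dists = np.linalg.norm(query_feats[:, None, :] - train_feats[None, :, :], axis=-1)
    # Indices of the k closest gold-labeled samples for each query.
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_labels[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Hypothetical logits [negative, positive]; the positive logit is always larger,
# mimicking a model that is biased toward the Positive category.
train_feats = [[1.0, 3.0], [1.2, 3.1], [0.2, 3.0], [0.1, 3.2]]
train_labels = [0, 0, 1, 1]  # gold labels: 0 = negative, 1 = positive
query_feats = [[1.1, 3.0], [0.0, 3.1]]

zero_shot = np.argmax(query_feats, axis=1)  # argmax over the biased logits
knn_pred = knn_predict(train_feats, train_labels, query_feats, k=3)
print(zero_shot.tolist(), knn_pred.tolist())
```

In this toy setup the argmax over raw logits assigns every query to the positive class, while the neighbors in logit space can still separate the two classes because the bias shifts all samples in a similar way.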

Paper Structure (17 sections, 2 equations, 7 figures, 2 tables)

Introduction
Methodology
Bias Output based kNN Modeling
Experiment
Setup
Datasets
Evaluation
Baselines
Implementation Details
Main Results
Ablation Study and Analysis
Robustness of Templates and Verbalizers
Biased Logit as a Feature
Impact of Distance Metrics
Impact of the Number of Nearest Neighbors (k)
...and 2 more sections

Figures (7)

Figure 1: Zero-shot probability and logit results from the CR train dataset, visualizing 50 samples each from the Positive and Negative categories. The model exhibits a clear bias towards the Positive category. The dashed line $y=x$ denotes the decision boundary for these categories.
Figure 2: The architecture of our proposed model
Figure 3: Verbalizer: robustness analysis. The shaded region denotes the standard deviation. All figures are consistent. Dashed lines of the same color indicate the Zero-LM accuracy for the corresponding settings.
Figure 4: Template: robustness analysis
Figure 5: Performance of biased logit as feature
...and 2 more figures
