$k$NN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference

Benfeng Xu; Quan Wang; Zhendong Mao; Yajuan Lyu; Qiaoqiao She; Yongdong Zhang

TL;DR

$k$NN Prompting tackles two core challenges of in-context learning (ICL): context-length limits and label-space calibration biases. It splits the training data into demonstrations and anchors, caches the LM output distribution for each anchor, and predicts a test label by KL-based nearest-neighbor retrieval over those cached distributions, which makes inference both calibration-free and beyond-context. Experiments across 10 datasets show robust data scaling from 2 to 1024 shots and improvements across LLM sizes. Qualitative analyses illuminate the mechanism: LM distributions are effective for matching a test instance to anchors, while the anchors' labels determine the prediction. The approach bridges data scaling and model scaling in a gradient-free deployment, offering practical gains for real-world LLM applications.

Abstract

In-Context Learning (ICL), which formulates target tasks as prompt completion conditioned on in-context demonstrations, has become the prevailing way to use LLMs. In this paper, we first expose a practical limitation of this usage: it cannot scale up with training data because of the context-length restriction. Besides, existing works have shown that ICL also suffers from various biases and requires delicate calibration treatment. To address both challenges, we advocate a simple and effective solution, $k$NN Prompting, which first queries the LLM with training data for distributed representations, then predicts test instances by simply referring to their nearest neighbors. We conduct comprehensive experiments to demonstrate its two-fold superiority: 1) Calibration-Free: $k$NN Prompting does not directly align the LLM output distribution with the task-specific label space; instead, it uses that distribution to align test and training instances. It significantly outperforms state-of-the-art calibration-based methods under comparable few-shot scenarios. 2) Beyond-Context: $k$NN Prompting can further scale up effectively with as much training data as is available, continually bringing substantial improvements. The scaling trend holds across 10 doublings of the training-set size, from 2 shots to 1024 shots, as well as across LLM scales ranging from 0.8B to 30B. It successfully bridges data scaling into model scaling, and opens new potential for the gradient-free paradigm of LLM deployment. Code is publicly available.
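
The nearest-neighbor inference step described above can be illustrated with a minimal sketch. It assumes the anchor distributions have already been obtained from the LLM and cached. The function names, the toy four-token vocabulary, the KL direction (test distribution first), and majority voting over the k nearest anchors are all illustrative assumptions, not the authors' released code.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

# A probability distribution over the LM vocabulary (assumed to sum to 1).
Distribution = Sequence[float]


def kl_divergence(p: Distribution, q: Distribution, eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # eps guards against log(0) when q assigns zero mass to a token.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def knn_predict(
    query: Distribution,
    anchors: List[Tuple[Distribution, str]],
    k: int = 3,
) -> str:
    """Return the majority label among the k anchors closest to the query."""
    if not anchors:
        raise ValueError("need at least one anchor")
    # Rank cached anchors by divergence from the query distribution.
    ranked = sorted(anchors, key=lambda a: kl_divergence(query, a[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical cached anchors: (LM distribution, gold label) pairs.
anchors = [
    ([0.7, 0.1, 0.1, 0.1], "positive"),
    ([0.6, 0.2, 0.1, 0.1], "positive"),
    ([0.1, 0.1, 0.7, 0.1], "negative"),
    ([0.1, 0.2, 0.6, 0.1], "negative"),
]

# Hypothetical LM distribution for one test instance.
prediction = knn_predict([0.65, 0.15, 0.1, 0.1], anchors, k=3)
```

In this sketch the LM distribution is only used to find similar anchors; the returned label comes from the anchors themselves, so no mapping from output probabilities to the label space is calibrated.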

Paper Structure (38 sections, 7 equations, 18 figures, 11 tables)

Introduction
Background: In-Context Learning
$k$NN Prompting
Meta Test
Formal Test
Experiments
Setup
Data Utility
Data Utility Under Few Shot Scenario
Superiority of Whole LM Distribution
Data Utility Beyond the Context
Continually Scaling Up to Thousands of Training Data
Comparison to Demonstration Selection
Analyses and Explanation
Robustness w.r.t. Different Training Examples
...and 23 more sections

Figures (18)

Figure 1: $k$NN Prompting brings substantial improvements over standard ICL, and can continually scale up beyond the context with as many data as are available. Conducted with GPT XL.
Figure 2: ICL improves with the number of training examples but is limited by the context-length restriction.
Figure 3: The overall framework of $k$NN Prompting
Figure 4: Data scaling under few shot scenario. Compared with calibration-based methods.
Figure 5: Data scaling under fully supervised scenario. Conducted across various LLM scales.
...and 13 more figures
