"In-Context Learning" or: How I learned to stop worrying and love "Applied Information Retrieval"

Andrew Parry, Debasis Ganguly, Manish Chandra

TL;DR

This paper reframes In-Context Learning (ICL) through the lens of Information Retrieval (IR) and proposes three IR-inspired directions for improving downstream ICL: adaptively selecting the number of demonstrations, via query performance prediction (QPP) or a learned κ(x); learning-to-rank approaches that order exemplars by downstream usefulness; and diversification and faceted IR to keep prompts informative and diverse. It formalizes ICL as P(y|x) = f(x, P_{k}(x); φ_{LLM}) and maps IR problems such as QPP, ranking, and diversity onto ICL components. A preliminary evaluation shows that supervised adaptive ICL (SAICL) can outperform static ICL, while unsupervised QPP-based methods may underperform. The work sets out a concrete agenda for cross-disciplinary methods, suggesting that efficient, task-specific selection and combination of ICL exemplars can meaningfully improve real-world NLP tasks. The evaluation uses GPT-J-6B on standard text classification benchmarks and reports reduced context length and improved accuracy with data-driven context selection.

Abstract

With the increasing ability of large language models (LLMs), in-context learning (ICL) has evolved as a new paradigm for natural language processing (NLP), where instead of fine-tuning the parameters of an LLM specific to a downstream task with labeled examples, a small number of such examples is appended to a prompt instruction for controlling the decoder's generation process. ICL, thus, is conceptually similar to a non-parametric approach, such as $k$-NN, where the prediction for each instance essentially depends on the local topology, i.e., on a localised set of similar instances and their labels (called few-shot examples). This suggests that a test instance in ICL is analogous to a query in IR, and similar examples in ICL retrieved from a training set relate to a set of documents retrieved from a collection in IR. While standard unsupervised ranking models can be used to retrieve these few-shot examples from a training set, the effectiveness of the examples can potentially be improved by re-defining the notion of relevance specific to its utility for the downstream task, i.e., considering an example to be relevant if including it in the prompt instruction leads to a correct prediction. With this task-specific notion of relevance, it is possible to train a supervised ranking model (e.g., a bi-encoder or cross-encoder), which potentially learns to optimally select the few-shot examples. We believe that the recent advances in neural rankers can potentially find a use case for this task of optimally choosing examples for more effective downstream ICL predictions.
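The retrieval analogy in the abstract can be made concrete with a minimal sketch: the test instance acts as the query, the labelled training set acts as the collection, and the top-k retrieved examples become the demonstrations in the prompt. Everything below is an illustrative assumption rather than the authors' implementation: bag-of-words cosine similarity stands in for "a standard unsupervised ranking model", and the function names, the toy sentiment data, and the prompt template are hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Lower-cased bag-of-words term counts (a stand-in for a real ranker's representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_examples(query, train, k):
    """Return the k labelled training examples most similar to the query."""
    q = bow(query)
    ranked = sorted(train, key=lambda ex: cosine(q, bow(ex[0])), reverse=True)
    return ranked[:k]


def build_prompt(query, examples, instruction):
    """Concatenate the instruction, the retrieved demonstrations, and the unlabelled query."""
    parts = [instruction]
    for text, label in examples:
        parts.append(f"Review: {text}\nSentiment: {label}")
    parts.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(parts)


# Hypothetical toy training set of (text, label) pairs.
train = [
    ("the plot was dull and the acting was flat", "negative"),
    ("a wonderful film with great acting", "positive"),
    ("terrible pacing and a dull script", "negative"),
    ("great soundtrack and a wonderful cast", "positive"),
]
query = "wonderful acting and a great script"
demos = retrieve_examples(query, train, k=2)
prompt = build_prompt(query, demos, "Classify the sentiment of each review.")
print(prompt)
```

The resulting prompt would then be passed to an LLM, which is left out here. A supervised ranker of the kind the abstract proposes would replace `cosine` with a learned score of how useful an example is for the downstream prediction; this sketch only covers the unsupervised baseline.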
Paper Structure (29 sections, 3 equations, 3 figures, 1 table, 2 algorithms)

Figures (3)

  • Figure 1: A workflow diagram illustrating how three verticals of IR research fit into the workflow of in-context learning (ICL). Section \ref{sec:aicl} discusses possible ways of adjusting unsupervised and supervised QPP approaches for adapting the number of ICL examples. Section \ref{sec:sicl} discusses ideas of how to learn the notion of downstream usefulness of examples. Section \ref{sec:dicl} discusses methodologies related to diversifying examples for ICL.
  • Figure 2: Example workflow of In-Context Learning for sentiment classification. The illustrative example shows a sample test instance for which a single demonstration (as retrieved from the training set) does not result in the correct prediction (prediction shown at the top). The example also shows that increasing the number of demonstrations from one to two results in the correct prediction (shown at the bottom). Demonstrations included within the prompt are shown in blue.
  • Figure 3: Motivation behind using a variable-sized neighborhood for $k$-NN classification \cite{learning_k_knn}: an instance close to a decision boundary (black '?') is likely to have higher heterogeneity in its class distribution, indicating the need for a larger neighborhood for effective classification.
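The intuition in the Figure 3 caption can be illustrated with a small sketch of a variable-sized neighborhood: keep enlarging the neighborhood while the class distribution among the nearest neighbours is too mixed. This is a simple heuristic written for illustration, not the method of the cited work or of the paper; the function name `adaptive_knn_predict`, the `k_min` default, and the `agreement` threshold are all assumptions.

```python
from collections import Counter


def adaptive_knn_predict(neighbor_labels, k_min=3, agreement=0.75):
    """Predict a label from neighbours sorted nearest-first, growing k as needed.

    The neighbourhood starts at k_min and grows by one neighbour at a time until
    the majority label covers at least `agreement` of it, or until all supplied
    neighbours have been used. Returns (predicted_label, k_used).
    """
    if not neighbor_labels:
        raise ValueError("need at least one neighbour label")
    k_max = len(neighbor_labels)
    k = max(1, min(k_min, k_max))
    while True:
        label, count = Counter(neighbor_labels[:k]).most_common(1)[0]
        if count / k >= agreement or k == k_max:
            return label, k
        k += 1


# An instance far from the decision boundary: its nearest neighbours agree at once.
easy = ["pos", "pos", "pos", "neg", "neg"]
# An instance near the boundary: labels are mixed, so the neighbourhood has to grow.
hard = ["pos", "neg", "pos", "neg", "pos", "pos", "pos", "pos"]

print(adaptive_knn_predict(easy))  # ('pos', 3)
print(adaptive_knn_predict(hard))  # ('pos', 8)
```

In the ICL setting the same idea would apply to the number of demonstrations placed in the prompt rather than to a vote over neighbours: a heterogeneous neighbourhood suggests that a test instance may need more examples.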