Auto-ICL: In-Context Learning without Human Supervision

Jinghan Yang, Shuming Ma, Furu Wei

TL;DR

This work tackles the bottleneck of human-crafted in-context learning by introducing Automatic In-Context Learning (Auto-ICL), where the model itself generates demonstrations and instructions to solve problems. It formalizes two generation modes (retrieving vs generating) and two forms of instructional content (1-to-1 and N-to-1), enabling flexible construction of problem-solving context. Empirical results across diverse reasoning domains and multiple models show that Auto-ICL-generated contexts can outperform human-annotated contexts and existing self-generation approaches like Zero-CoT and Auto-CoT, with nuanced trade-offs between efficiency and accuracy depending on the retrieval setup. The framework reduces reliance on human labeling, expands applicability to tasks challenging for humans, and highlights the value of combining demonstrations with high-level instructions for robust reasoning across datasets such as Theory of Mind, symbolic reasoning, arithmetic, and others.

Abstract

In-context learning can significantly boost the performance of large language models when they are provided with appropriate context. However, existing in-context learning methods rely mainly on human-provided context, such as labeled examples and explicit instructions. Writing this context by hand is labor-intensive and restricts the model to tasks that humans can manage. To overcome these limitations, we propose the Automatic In-Context Learning (Auto-ICL) framework, which enables the model to autonomously generate examples and instructions for problem-solving. In experiments across various models and datasets, model-generated contexts outperform human-annotated contexts, including the Few-Shot and Few-Shot-CoT methods, and surpass existing self-generated context methods such as Zero-CoT and Auto-CoT.

Paper Structure (25 sections, 3 equations, 4 figures, 29 tables)

Figures (4)

  • Figure 1: The problem-solving process is broken down into two steps. In Step 1, the model is prompted to generate contextual information (in the form of demonstrations or instructions) that aids in answering the given query. In Step 2, the model is provided with the query and its self-generated context to produce the result. The generated demonstrations and instructions can be stored in a database and further refined by the LLM.
  • Figure 2: Left: An example of the model generating a demonstration. Right: Given queries and reasoning paths, the model generates a general instruction. To generate instructions, demonstrations can come from either the retrieved set or the generated set.
  • Figure 3: An illustration of how the generated context is used for problem-solving in Auto-ICL (retrieving) and Auto-ICL (generating). The Few-Shot-CoT method is placed in the upper-left corner for comparison. In Auto-ICL (retrieving), the instruction is applied to all queries in the same dataset. In Auto-ICL (generating), the examples and instructions are tailored to each query. The generation process for demonstrations and instructions is shown in Figure 2.
  • Figure 4: A comparison of instructions generated by different methods on the AQUA dataset with GPT-3.5-turbo-0301.
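
The two-step process described in the Figure 1 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `LLM` callable, the prompt wording, the function names (`generate_context`, `solve`, `auto_icl`), and the dictionary cache are all hypothetical. The cache is only a simplified stand-in for the database mentioned in the caption and does not reproduce the paper's retrieval procedure.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for a language-model call: prompt in, text out.
LLM = Callable[[str], str]


def generate_context(llm: LLM, query: str, n_demos: int = 2) -> str:
    """Step 1: ask the model for its own demonstrations and an instruction."""
    # Assumed prompt wording; the paper's actual prompts are not shown here.
    demos = [
        llm(
            "Write a problem similar to the one below and solve it step by step.\n"
            f"Problem: {query}\nExample {i + 1}:"
        )
        for i in range(n_demos)
    ]
    joined = "\n\n".join(demos)
    instruction = llm(
        "Given these worked examples, write a general instruction for solving "
        f"problems of this kind.\n\n{joined}\n\nInstruction:"
    )
    return f"{instruction}\n\n{joined}"


def solve(llm: LLM, query: str, context: str) -> str:
    """Step 2: answer the query with the self-generated context prepended."""
    return llm(f"{context}\n\nProblem: {query}\nAnswer:")


def auto_icl(
    llm: LLM,
    query: str,
    cache: Optional[Dict[str, str]] = None,
    dataset_key: Optional[str] = None,
) -> str:
    """Run both steps; optionally reuse one stored context per dataset."""
    if cache is not None and dataset_key is not None:
        # Shared context: generate once, then reuse for later queries.
        if dataset_key not in cache:
            cache[dataset_key] = generate_context(llm, query)
        context = cache[dataset_key]
    else:
        # Per-query context: generate fresh context for every query.
        context = generate_context(llm, query)
    return solve(llm, query, context)
```

Passing a cache and a dataset key loosely mirrors the shared-instruction setting in the Figure 3 caption, where one context serves all queries in a dataset. Omitting them mirrors the per-query setting, which costs more model calls per question.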