PACIT: Unlocking the Power of Examples for Better In-Context Instruction Tuning

Tianci Xue, Ziqi Wang, Yixia Li, Yun Chen, Guanhua Chen

TL;DR

The PACIT method unlocks the power of examples by encouraging the model to actively learn to grasp the distinctions between the positive and negative examples instead of merely reading.

Abstract

Instruction tuning enhances the instruction-following ability of large language models by finetuning them on supervised instruction data. Previous work proposes in-context instruction tuning (ICIT), in which specific positive or negative examples are incorporated into the prompt for better performance. In this work, we propose PACIT, a simple and effective in-context instruction tuning method inspired by the pedagogical concept of desirable difficulty. The PACIT method unlocks the power of examples by encouraging the model to actively learn to grasp the distinctions between the positive and negative examples instead of merely reading them. The model is expected to first verify the correctness of the provided example according to the task description; this verification then serves as the condition for generating a better response to the task instance. Our extensive experiments demonstrate the effectiveness of PACIT, which outperforms the ICIT baseline on in-domain and out-of-domain tasks by up to 9.16 and 3.14 average ROUGE-L points, respectively. Moreover, PACIT notably enhances the performance of instruction tuning even when all positive and negative examples are generated with a self-instruct method.
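
The verify-then-answer idea described in the abstract can be illustrated with a minimal sketch of how one training sample might be assembled. This is not the paper's data template: the field labels ("Task:", "Example 1:", "Answer:"), the `Example` class, and the `build_pacit_sample` function are hypothetical names chosen for illustration. Only the overall ordering (classify each example, add the self-reminder, then answer) and the self-reminder wording follow the paper's description.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Example:
    """A demonstration shown in the prompt (hypothetical structure)."""
    input_text: str
    output_text: str
    is_correct: bool  # gold label: True for a positive example, False for a negative one


# Self-reminder wording as quoted in the paper's overview figure caption.
SELF_REMINDER = "I should learn from correct examples and avoid wrong examples."


def build_pacit_sample(
    task_description: str,
    examples: List[Example],
    instance_input: str,
    instance_output: str,
) -> Tuple[str, str]:
    """Return (prompt, target) for one training sample.

    The layout is an assumption for illustration; the actual template
    used by the authors may differ.
    """
    # Prompt: task description, unlabeled examples, then the task instance.
    prompt_lines = [f"Task: {task_description}"]
    for i, ex in enumerate(examples, start=1):
        prompt_lines.append(
            f"Example {i}: input: {ex.input_text} | output: {ex.output_text}"
        )
    prompt_lines.append(f"Input: {instance_input}")

    # Stage 1 (classification): the target first judges each example.
    target_lines = [
        f"Example {i} is {'correct' if ex.is_correct else 'wrong'}."
        for i, ex in enumerate(examples, start=1)
    ]
    target_lines.append(SELF_REMINDER)

    # Stage 2 (answering): the response follows the judgments, so it is
    # generated conditioned on them within the same sample.
    target_lines.append(f"Answer: {instance_output}")

    return "\n".join(prompt_lines), "\n".join(target_lines)
```

Because both stages sit in a single target sequence, an ordinary left-to-right language-modeling loss over the target would condition the answer tokens on the classification tokens, which is one plausible way to realize the conditioning the abstract describes.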

Paper Structure

This paper contains 30 sections, 2 equations, 6 figures, and 9 tables.

Figures (6)

  • Figure 1: Overview of PACIT. PACIT consists of two stages: Classification and Answering. (1) Classification: judge the correctness of each provided example based on the task description, then take the self-reminder action (i.e., "I should learn from correct examples and avoid wrong examples."). (2) Answering: respond to the main task instruction conditioned on the classification results. The two stages are executed sequentially within a single data sample.
  • Figure 2: A concrete example of attention visualization for the SuperNI (Few-Shot) and PACIT methods.
  • Figure 3: Training dynamics of the main task (ROUGE-L) vs. the auxiliary classification task (Acc). Acc: the accuracy of classification. ROUGE-L: the performance on the main tasks. The five data points are the five checkpoints saved after each epoch.
  • Figure 4: The data template used for the PACIT method.
  • Figure 5: The data template used for the classification task when the two stages are trained separately.
  • ...and 1 more figure
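
The main-task results above are reported in ROUGE-L, which scores a prediction by its longest common subsequence (LCS) with the reference. The sketch below shows the standard LCS-based F-measure under simplifying assumptions that are not taken from the paper: lowercased whitespace tokenization, equal weighting of precision and recall, and a single reference. The function names are illustrative, and the authors' evaluation script may tokenize and aggregate differently.

```python
from typing import List


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of `a`
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction: str, reference: str) -> float:
    """ROUGE-L F1 between one prediction and one reference.

    Assumes lowercased whitespace tokenization (a simplification).
    """
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)  # share of predicted tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

For example, a prediction of "a b c d" against a reference of "a c" has an LCS of length 2, giving precision 0.5, recall 1.0, and an F1 of about 0.667.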