Mitigating Copy Bias in In-Context Learning through Neuron Pruning

Ameen Ali, Lior Wolf, Ivan Titov

TL;DR

This work creates a synthetic task and uses the Integrated Gradients method to identify neurons that prioritize copying over generalization, and demonstrates that pruning these neurons consistently improves performance across a diverse set of ICL tasks.

Abstract

Large language models (LLMs) have demonstrated impressive few-shot in-context learning (ICL) abilities. Still, we show that they are sometimes prone to a "copying bias", where they copy answers from provided examples instead of learning the underlying patterns. In this work, we propose a novel and simple method to mitigate such copying bias. First, we create a synthetic task and use the Integrated Gradients method to identify neurons that prioritize copying over generalization. We demonstrate that pruning these neurons consistently improves performance across a diverse set of ICL tasks. We also show that our method is applicable across various LLM architectures, including Transformers and State-Space Models, without requiring modifications. In our analysis, we adopt a task-recognition perspective on ICL and examine task vectors (Hendel et al., 2023) induced by the model. We find that pruning enhances the quality of these vectors, suggesting that the pruned neurons previously hindered effective task recognition.

Paper Structure (21 sections, 8 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: A high-level depiction of our proposed method of detecting copying neurons. First, in (a) we feed ICL prompts from the synthetic dataset. In this phase, we are only interested in the prompts where the model outputs a wrong response which also appears in the prompt examples. Second, in (b) we use these prompts and calculate the sum of the probabilities over predicted responses that appear in the prompt. This sum is used within the IG framework to attribute it to neurons in the targeted layer.
  • Figure 2: Percentage of total errors and copying errors for both the pruned and unpruned models. Results are shown for 3 ICL tasks across 3 different models: GPT2-Small, Bloom-560M, and OPT-1.3B. The dark bar in each diagram represents the unpruned version, while the lighter bar represents the pruned version; the entire bar height represents the total error of the model, and the shaded part represents the copying error rate.
  • Figure 3: Summary of the results on the synthetic ICL tasks. For more information on the tasks and the exact numbers, refer to Appendix \ref{appx:results}.
  • Figure 4: Results of Llama-2 and Llama-3 on SST2, SST5, and the Object Counting task from the BBH benchmark.
  • Figure 5: Task-vector accuracies for the OPT-2.7B and Bloom-560M models, tested on (1) Singular Plural and (2) Country Capital ICL tasks. We show the task-vector accuracies with and without pruning the detected copying neurons. As can be seen, pruning improves the quality of the extracted task vectors across the different shots $\in [1, 2, 3, 4]$ for the two models and ICL tasks.
  • ...and 1 more figures
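
The detection step described in the Figure 1 caption (attribute the summed probability of in-prompt answers to neurons with Integrated Gradients, then prune the highest-scoring neurons) can be illustrated with a minimal sketch. This is not the authors' implementation: the "model" is a hypothetical toy linear-softmax readout over one hidden layer, the baseline is assumed to be the all-zero activation, the path integral is approximated with a midpoint Riemann sum, and every name (`copy_prob`, `integrated_gradients`, `prune_top_neurons`, `weights`, `copy_ids`) is invented for illustration. The sketch also skips the filtering step that keeps only prompts on which the model gives a wrong, copied answer.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def readout(hidden, weights):
    """Toy stand-in for the rest of the model: logits = weights @ hidden."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]


def copy_prob(hidden, weights, copy_ids):
    """Attribution target: total probability of answers that appear in the prompt."""
    probs = softmax(readout(hidden, weights))
    return sum(probs[k] for k in copy_ids)


def copy_prob_grad(hidden, weights, copy_ids):
    """Analytic gradient of copy_prob with respect to the hidden activations."""
    probs = softmax(readout(hidden, weights))
    total = sum(probs[k] for k in copy_ids)
    # d(copy_prob)/d(logit_j) = p_j * (1[j in copy_ids] - copy_prob)
    d_logits = [p * ((1.0 if j in copy_ids else 0.0) - total)
                for j, p in enumerate(probs)]
    # Chain rule through the linear readout.
    return [sum(d * row[i] for d, row in zip(d_logits, weights))
            for i in range(len(hidden))]


def integrated_gradients(hidden, weights, copy_ids, steps=256):
    """IG per neuron with an assumed all-zero baseline (midpoint Riemann sum)."""
    avg_grad = [0.0] * len(hidden)
    for s in range(steps):
        alpha = (s + 0.5) / steps
        point = [alpha * h for h in hidden]  # point on the baseline-to-input path
        grad = copy_prob_grad(point, weights, copy_ids)
        avg_grad = [a + g / steps for a, g in zip(avg_grad, grad)]
    # (input - baseline) * average gradient along the path
    return [h * g for h, g in zip(hidden, avg_grad)]


def prune_top_neurons(hidden, scores, k):
    """Zero out the k neurons with the highest attribution scores."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    pruned = [0.0 if i in top else h for i, h in enumerate(hidden)]
    return pruned, top


# Hypothetical toy setup: 3 hidden neurons, 4 candidate tokens.
# Tokens 0 and 1 play the role of answers that appear in the prompt.
weights = [
    [2.0, 0.3, 0.1],
    [1.5, -0.2, 0.4],
    [-1.0, 0.5, 0.2],
    [-1.0, 0.1, -0.3],
]
hidden = [1.0, 0.5, -0.3]
copy_ids = {0, 1}

scores = integrated_gradients(hidden, weights, copy_ids)
pruned, top = prune_top_neurons(hidden, scores, k=1)

print("IG scores per neuron:", scores)
print("pruned neuron indices:", top)
print("copy probability before:", copy_prob(hidden, weights, copy_ids))
print("copy probability after: ", copy_prob(pruned, weights, copy_ids))
```

In a real LLM the gradient would come from automatic differentiation through the layers above the targeted one, and the scores would be aggregated over many prompts before choosing which neurons to prune; the toy readout only makes the attribution and pruning logic easy to follow.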