Understanding Fact Recall in Language Models: Why Two-Stage Training Encourages Memorization but Mixed Training Teaches Knowledge
Ying Zhang, Benjamin Heinzerling, Dongyuan Li, Ryoma Ishigaki, Yuta Hitomi, Kentaro Inui
TL;DR
This paper investigates why two-stage training promotes memorization while mixed BIO+QA training teaches generalizable fact recall in language models. It introduces the cross-task gradient trace to identify shared parameters influenced by both fact-storing and fact-recalling data, and demonstrates that mixed training yields a larger and more centralized set of shared parameters than two-stage training. These shared parameters concentrate in critical attention heads and a subset of MLP neurons, forming a fact-recall circuit; targeted interventions (ablation, grafting, and circuit analysis) reveal their pivotal role in cross-form recall and parameter-efficient generalization. The findings illuminate the internal mechanisms of cross-task learning and offer a model-agnostic tool for interpretable analysis of fact recall, with implications for designing training strategies that teach knowledge rather than encourage memorization.
Abstract
Fact recall, the ability of language models (LMs) to retrieve specific factual knowledge, remains a challenging task despite their impressive general capabilities. Common training strategies often struggle to promote robust recall behavior: two-stage training, which first trains a model on fact-storing examples (e.g., factual statements) and then on fact-recalling examples (question-answer pairs), tends to encourage rote memorization rather than generalizable fact retrieval. In contrast, mixed training, which jointly uses both types of examples, has been empirically shown to improve the ability to recall facts, but the underlying mechanisms are still poorly understood. In this work, we investigate how these training strategies shape model parameters during training and how these differences relate to the models' ability to recall facts. We introduce the cross-task gradient trace to identify shared parameters, i.e., those strongly influenced by both fact-storing and fact-recalling examples. Our analysis on synthetic fact recall datasets with the Llama-3.2B and Pythia-2.8B models reveals that mixed training encourages a larger and more centralized set of shared parameters. These findings suggest that the emergence of shared parameters may play a key role in enabling LMs to generalize factual knowledge across task formulations.
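To make the idea of shared parameters concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not give the exact definition of the cross-task gradient trace, so the sketch assumes one simple reading: accumulate each parameter's absolute gradient over training steps separately for fact-storing (BIO) and fact-recalling (QA) examples, then call a parameter shared if it ranks in the top k for both. The function names, the top-k rule, and the toy gradient values are all hypothetical.

```python
def accumulate_trace(step_grads):
    """Sum absolute gradients over training steps, one total per parameter.

    step_grads: list of per-step gradient vectors (lists of floats).
    """
    n_params = len(step_grads[0])
    return [sum(abs(g[i]) for g in step_grads) for i in range(n_params)]


def top_k(trace, k):
    """Indices of the k parameters with the largest accumulated trace."""
    ranked = sorted(range(len(trace)), key=lambda i: trace[i], reverse=True)
    return set(ranked[:k])


def shared_parameters(bio_step_grads, qa_step_grads, k):
    """Assumed selection rule: shared = in the top k for BOTH example types."""
    bio_top = top_k(accumulate_trace(bio_step_grads), k)
    qa_top = top_k(accumulate_trace(qa_step_grads), k)
    return bio_top & qa_top


# Toy data (made up for illustration): 3 training steps, 6 parameters.
bio_step_grads = [
    [0.9, 0.1, 0.8, 0.0, 0.1, 0.0],
    [0.7, 0.0, 0.9, 0.1, 0.0, 0.1],
    [0.8, 0.1, 0.7, 0.0, 0.1, 0.0],
]
qa_step_grads = [
    [0.0, 0.1, 0.9, 0.8, 0.0, 0.1],
    [0.1, 0.0, 0.8, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.7, 0.7, 0.0, 0.1],
]

shared = shared_parameters(bio_step_grads, qa_step_grads, k=2)
print(sorted(shared))  # → [2]
```

In this toy case, parameters 0 and 2 dominate the fact-storing gradients and parameters 2 and 3 dominate the fact-recalling gradients, so only parameter 2 counts as shared. Under this reading, comparing training strategies would amount to comparing the size and location of the shared set each one produces.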
