Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models
Mansi Sakarvadia, Aswathy Ajith, Arham Khan, Daniel Grzenda, Nathaniel Hudson, André Bauer, Kyle Chard, Ian Foster
TL;DR
This paper addresses unreliable multi-hop reasoning in transformer-based LLMs by identifying the attention-head mechanisms that retrieve memories during inference. It proposes a lightweight memory-injection method that adds prompt-relevant memories directly to the hidden activations at selected attention layers, so that the model can recall intermediate information and produce better multi-hop completions. In experiments on GPT-2 variants with programmatically generated and human-written datasets, targeted injections substantially increase the probability of the correct next token for multi-hop prompts, whereas random injections generally harm performance. The work contributes to interpretability and knowledge editing in LLMs and points toward memory augmentation applied at inference time, while also noting its limitations and ethical considerations around bias and potential misuse.
Abstract
Answering multi-hop reasoning questions requires retrieving and synthesizing information from diverse sources. Large Language Models (LLMs) struggle to perform such reasoning consistently. Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads. First, we analyze the per-layer activations of GPT-2 models in response to single-hop and multi-hop prompts. We then propose a mechanism that allows users to inject pertinent prompt-specific information, which we refer to as "memories," at critical LLM locations during inference. By thus enabling the LLM to incorporate additional relevant information during inference, we enhance the quality of multi-hop prompt completions. We show empirically that a simple, efficient, and targeted memory injection into a key attention layer can often increase the probability of the desired next token in multi-hop tasks by up to 424%.
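To make the injection step concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes that a memory is represented as the sum of its token embeddings, scaled by a hypothetical magnitude `tau` and added to the output of one attention layer. The four-word vocabulary, the one-hot embedding matrix `W_E`, the tied unembedding, and all function names are illustrative assumptions. The toy only shows the mechanical effect of adding a memory to a hidden activation; it does not model the later layers that would turn an injected intermediate entity into the final answer.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def embed_memory(memory_ids, W_E):
    """Sum the embeddings of the memory tokens (assumed memory representation)."""
    d_model = len(W_E[0])
    return [sum(W_E[i][k] for i in memory_ids) for k in range(d_model)]

def inject_memory(attn_out, memory_ids, W_E, tau):
    """Add the memory, scaled by the hypothetical magnitude tau, to an attention output."""
    memory = embed_memory(memory_ids, W_E)
    return [h + tau * m for h, m in zip(attn_out, memory)]

def next_token_probs(hidden, W_E):
    """Read out next-token probabilities with a tied unembedding (toy assumption)."""
    logits = [sum(h * w for h, w in zip(hidden, row)) for row in W_E]
    return softmax(logits)

def percent_increase(p_before, p_after):
    """Relative change in probability, expressed as a percentage."""
    return (p_after - p_before) / p_before * 100.0

# Hypothetical toy setup: one-hot embeddings so the effect is easy to reason about.
VOCAB = ["Spain", "Madrid", "Paris", "Rome"]
W_E = [[1.0 if i == j else 0.0 for j in range(len(VOCAB))] for i in range(len(VOCAB))]

attn_out = [0.2, 0.1, 0.4, 0.3]            # stand-in for one attention layer's output
memory_ids = [VOCAB.index("Spain")]        # the memory to inject

p_before = next_token_probs(attn_out, W_E)
p_after = next_token_probs(inject_memory(attn_out, memory_ids, W_E, tau=2.0), W_E)

idx = memory_ids[0]
print("before:", p_before[idx])
print("after: ", p_after[idx])
print("percent increase:", percent_increase(p_before[idx], p_after[idx]))
```

In a real model, the same addition would presumably be applied to the activations of a chosen layer during the forward pass, and the layer index and `tau` would need to be selected empirically. The `percent_increase` helper shows how a relative figure such as the abstract's "up to 424%" can be computed from the token probability before and after injection.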
