Pay Attention to What Matters
Pedro Luiz Silva, Antonio de Domenico, Ali Maatouk, Fadhel Ayed
TL;DR
This work introduces GUIDE, a simple and effective method that mechanistically increases attention scores on instruction tokens, and presents Influence, a novel metric that highlights how the user's instructions propagate through the transformer layers and impact the LLM output.
Abstract
Despite the remarkable success of Large Language Models (LLMs), they still exhibit a limited capability to align their outputs with user instructions. In this work, we introduce a simple and effective method, which we name GUIDE, that mechanistically increases attention scores on instruction tokens. To support this operation, we present Influence, a novel metric that highlights how the user's instructions propagate through the transformer layers and impact the LLM output. Our results show that GUIDE improves the accuracy of following instructions 29.4% to 60.4%, outperforming natural prompting alternatives and Supervised Fine-Tuning up to 1M tokens.
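To make the idea of "increasing attention scores on instruction tokens" concrete, here is a minimal sketch for a single query position. It assumes the boost is a constant additive bias applied to the pre-softmax attention logits at instruction positions; the abstract does not specify the mechanism, so this is an illustration of one plausible reading rather than the authors' implementation. The names `boosted_attention`, `instruction_mask`, and `delta` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def boosted_attention(logits, instruction_mask, delta):
    """Attention weights after adding `delta` to instruction-token logits.

    Assumption: the boost is an additive bias before the softmax.
    `instruction_mask[i]` is True when key position i is an instruction token.
    """
    biased = [
        x + delta if is_instruction else x
        for x, is_instruction in zip(logits, instruction_mask)
    ]
    return softmax(biased)


# Toy example: four key positions, the first two belong to the instruction.
logits = [0.5, 1.0, 0.2, 0.8]
instruction_mask = [True, True, False, False]

base = softmax(logits)
boosted = boosted_attention(logits, instruction_mask, delta=1.0)

# Total attention mass placed on instruction tokens, before and after the boost.
base_mass = sum(w for w, m in zip(base, instruction_mask) if m)
boosted_mass = sum(w for w, m in zip(boosted, instruction_mask) if m)
```

Because the softmax renormalizes, raising the instruction logits shifts attention mass toward the instruction and away from the remaining tokens, while the weights still sum to one. With `delta=0.0` the function reduces to ordinary softmax attention.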
