ExpNote: Black-box Large Language Models are Better Task Solvers with Experience Notebook
Wangtao Sun, Xuanqing Yu, Shizhu He, Jun Zhao, Kang Liu
TL;DR
ExpNote tackles the challenge of adapting black-box LLMs to downstream tasks by introducing an automated Experience Notebook that stores task-specific insights in a dynamic external memory. During a training stage, the framework prompts the LLM to generate and store experiences using minimal ground-truth feedback; during testing, it retrieves relevant experiences to condition predictions. The LLM interacts with the memory through commands such as THINK, NOTE, and RECALL. Empirical results on CLUTRR, METS-CoV, EMOJI, and LETS show substantial gains over CoT and other baselines, with improvements correlating with the availability of both positive and negative experiences and with retrieval effectiveness. The approach improves task solving for black-box LLMs without fine-tuning or access to model parameters, offering a practical pathway toward robust, memory-augmented reasoning in real-world applications, though it may be less effective for highly case-specific or creative tasks.
Abstract
Black-box Large Language Models (LLMs) have shown great power in solving various tasks and are considered general problem solvers. However, LLMs still fail on many specific tasks even when they understand the task instructions. In this paper, we focus on the problem of boosting the ability of black-box LLMs to solve downstream tasks. We propose ExpNote, an automated framework that helps LLMs better adapt to unfamiliar tasks by reflecting on and noting experiences from training data and retrieving them from external memory during testing. We evaluate ExpNote on multiple tasks, and the experimental results demonstrate that the proposed method significantly improves the performance of black-box LLMs. The data and code are available at https://github.com/forangel2014/ExpNote.
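
The note-then-retrieve idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the class `ExperienceNotebook`, the function `build_prompt`, the word-overlap retriever, and the example experiences are all hypothetical and chosen only to show how noted experiences might be recalled and prepended to a prompt for a black-box model.

```python
from dataclasses import dataclass, field


@dataclass
class ExperienceNotebook:
    """Hypothetical external memory: maps a key phrase to noted experiences."""
    entries: dict = field(default_factory=dict)

    def note(self, key: str, experience: str) -> None:
        # NOTE: store an experience under a key (assumed to be chosen by the model).
        self.entries.setdefault(key.lower(), []).append(experience)

    def recall(self, query: str, top_k: int = 2) -> list:
        # RECALL: rank keys by word overlap with the query.
        # Word overlap is an assumed stand-in for whatever retriever is used.
        query_words = set(query.lower().split())
        scored = []
        for key, experiences in self.entries.items():
            overlap = len(query_words & set(key.split()))
            if overlap > 0:
                scored.append((overlap, key, experiences))
        # Highest overlap first; ties broken alphabetically by key.
        scored.sort(key=lambda item: (-item[0], item[1]))
        results = []
        for _, _, experiences in scored[:top_k]:
            results.extend(experiences)
        return results


def build_prompt(question: str, notebook: ExperienceNotebook) -> str:
    # Prepend recalled experiences so a black-box model can condition on them.
    recalled = notebook.recall(question)
    lines = [f"Experience: {e}" for e in recalled]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    nb = ExperienceNotebook()
    nb.note("mother brother", "A mother's brother is an uncle.")
    print(build_prompt("who is the brother of my mother", nb))
```

In this sketch the notebook is filled during a training stage and only read during testing; the prompt returned by `build_prompt` would then be sent to the black-box LLM through whatever interface it exposes.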
