Tell Me What To Learn: Generalizing Neural Memory to be Controllable in Natural Language

Max S. Bennett, Thomas P. Zollo, Richard Zemel

TL;DR

This work proposes a generalized neural memory system that performs flexible updates based on learning instructions specified in natural language. The approach enables adaptive agents to learn selectively from heterogeneous information sources, supporting settings such as healthcare and customer service, where fixed-objective memory updates are insufficient.

Abstract

Modern machine learning models are deployed in diverse, non-stationary environments where they must continually adapt to new tasks and evolving knowledge. Continual fine-tuning and in-context learning are costly and brittle, whereas neural memory methods promise lightweight updates with minimal forgetting. However, existing neural memory models typically assume a single fixed objective and homogeneous information streams, leaving users with no control over what the model remembers or ignores over time. To address this challenge, we propose a generalized neural memory system that performs flexible updates based on learning instructions specified in natural language. Our approach enables adaptive agents to learn selectively from heterogeneous information sources, supporting settings, such as healthcare and customer service, where fixed-objective memory updates are insufficient.

Paper Structure (54 sections, 5 equations, 13 figures, 13 tables)

Figures (13)

  • Figure 1: An AI system with memory can adapt to its environment continuously by integrating diverse information sources. We propose a generalized neural memory system that performs flexible long-term updates based on learning instructions specified in natural language. Our approach enables important use cases in critical domains such as healthcare, where an adaptive agent must learn from heterogeneous documents that preclude using a neural memory system with a fixed objective.
  • Figure 2: Examples from our benchmark, including a document and a sample of three possible learning instructions, each with a sample of possible queries and correct responses.
  • Figure 3: Continual Learning of Targeted Facts (Heatmaps). Shows average performance on each desideratum across an episode. The x-axis, 'Documents Seen', gives the position in the sequence that the memory state has reached (e.g., an x-axis value of $i$ is the memory state after seeing the first $i$ documents in an episode). The y-axis, 'Query Timestep', gives the sequence element from which a set of queries derives (e.g., a y-axis value of $j$ denotes the queries sampled from the document-instruction pair that was the $j$th element of the episode sequence). The region below the diagonal is ignored because it corresponds to queries for document-instruction pairs that have not yet been seen.
  • Figure 4: Continual Learning of Targeted Facts (Overall Performance). The left chart ('Total Performance') shows the average performance across all time steps. 'Score' is the harmonic mean of these averages across accuracy, specificity, and selectivity. 'Base' is the performance of Llama-3 probed on the queries alone, without any document or learning instruction; it is shown for reference. The base model naturally does well on specificity and selectivity, as it represents the performance of the model without any interference from new facts. The right chart ('Performance By Recency') reports the 'score' of the queries associated with the document learned $x$ steps ago, where $x$ is 0, 4, or 9. Error bars show 95% confidence intervals (CI).
  • Figure 5: Continual Learning of Knowledge, Styles, and Behaviors. The left three charts report results by recency, to evaluate retention over the course of an episode. The plot on the far right shows the inference cost in FLOPs per token as a function of how many document-instruction pairs have been learned. Error bars show 95% CI. Full details are reported in Appendix \ref{appendix:exp_2_detailed_results}.
  • ...and 8 more figures
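
The Figure 3 and Figure 4 captions describe two small pieces of bookkeeping: which heatmap cells are evaluated, and how the 'score' combines the three metrics. The sketch below illustrates both in Python. It is not the authors' evaluation code. The function names, the 1-based indexing, the treatment of a zero metric as a zero score, and the reading of "learned $x$ steps ago" as query timestep $j = i - x$ are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def harmonic_mean(values: List[float]) -> float:
    """Harmonic mean of non-negative values; 0.0 if any value is 0 (assumed convention)."""
    if any(v <= 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)


def score(accuracy: float, specificity: float, selectivity: float) -> float:
    """'Score' as described in the Figure 4 caption: harmonic mean of the three averages."""
    return harmonic_mean([accuracy, specificity, selectivity])


def valid_cells(num_docs: int) -> List[Tuple[int, int]]:
    """Heatmap cells (documents_seen i, query_timestep j) that are evaluated.

    A cell is kept only when j <= i, i.e. the queried document-instruction
    pair has already been seen by the memory (1-based indices, assumed).
    """
    return [(i, j) for i in range(1, num_docs + 1) for j in range(1, i + 1)]


def query_timestep_for_recency(documents_seen: int, steps_ago: int) -> Optional[int]:
    """Query timestep of the pair learned `steps_ago` steps before the current state.

    Returns None when fewer than steps_ago + 1 documents have been seen.
    """
    j = documents_seen - steps_ago
    return j if j >= 1 else None


if __name__ == "__main__":
    print(score(0.5, 0.5, 0.5))
    print(valid_cells(3))
    print(query_timestep_for_recency(10, 9))
```

Because the harmonic mean is dominated by its smallest term, a method that scores well on accuracy but poorly on specificity or selectivity receives a low overall score under this aggregation.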