Schrodinger's Memory: Large Language Models

Wei Wang; Qing Li

TL;DR

This paper argues that LLM memory operates like Schrödinger's memory, meaning that it only becomes observable when a specific memory is queried, and expands on this concept by comparing the memory capabilities of the human brain and LLMs, highlighting the similarities and differences in their operational mechanisms.

Abstract

Memory is the foundation of all human activities; without memory, it would be nearly impossible for people to perform any task in daily life. With the development of Large Language Models (LLMs), their language capabilities are becoming increasingly comparable to those of humans. But do LLMs have memory? Based on current performance, LLMs do appear to exhibit memory. So, what is the underlying mechanism of this memory? Previous research has lacked a deep exploration of LLMs' memory capabilities and the underlying theory. In this paper, we use the Universal Approximation Theorem (UAT) to explain the memory mechanism in LLMs. We also conduct experiments to verify the memory capabilities of various LLMs, proposing a new method to assess their abilities based on this memory ability. We argue that LLM memory operates like Schrödinger's memory, meaning that it only becomes observable when a specific memory is queried. We can only determine if the model retains a memory based on its output in response to the query; otherwise, it remains indeterminate. Finally, we expand on this concept by comparing the memory capabilities of the human brain and LLMs, highlighting the similarities and differences in their operational mechanisms.

Paper Structure (11 sections, 4 equations, 3 figures, 2 tables)

Introduction
UAT and LLMs
UAT
The UAT Format of Transformer-Based LLMs
The Memory of LLMs
The Definition of Memory
Dataset
The Memory Mechanism and Ability of LLMs
The Outputs Length Effect
A Comparison Between Human Brain and LLMs
Conclusion

Figures (3)

Figure 1: The basic block in Transformer-based LLMs.
Figure 2: Examples of correct predictions. CN Poems: Qwen1.5-0.5B-Chat and ENG Poems: Qwen1.5-0.5B-Chat, which were fine-tuned separately on CN Poems and ENG Poems and then tested for memory ability on their respective datasets, accurately recited the entire poem based on the input.
Figure 3: Examples of wrong predictions by CN Poems: Qwen1.5-0.5B-Chat and ENG Poems: Qwen1.5-0.5B-Chat, which were fine-tuned separately on CN Poems and ENG Poems.
