Random Tree Model of Meaningful Memory

Weishun Zhong, Tankut Can, Antonis Georgiou, Ilya Shnayderman, Mikhail Katkov, Misha Tsodyks

TL;DR

The paper asks how people recall meaningful narratives under working-memory constraints. It introduces a random-tree memory model that encodes each narrative as a hierarchy with branching factor $K$ and depth $D$, and models recall as a deterministic traversal bounded by working memory. An analytical solution via stars-and-bars counting and spectral decomposition of a Markov chain yields a saturating recall length $C$ approaching $K^{D-1}$ and a scale-invariant compression distribution $f(s)$ for long narratives, with a notable link to the Riemann zeta function in special cases. Empirical data from 11 narratives (lengths $L$ between 19 and 194) collected from 100 subjects, plus mappings produced by large language models, show sublinear growth of recall length with narrative size and increasing compression, in line with the theory, and point to the central role of working-memory capacity in shaping narrative recall. The findings offer a quantitative framework for meaningful memory that captures key statistical regularities across narratives and suggests how memory representations and recall strategies scale with narrative length, with robustness checked via cross-model mappings.

Abstract

Traditional studies of memory for meaningful narratives focus on specific stories and their semantic structures but do not address common quantitative features of recall across different narratives. We introduce a statistical ensemble of random trees to represent narratives as hierarchies of key points, where each node is a compressed representation of its descendant leaves, which are the original narrative segments. Recall is modeled as constrained by working memory capacity from this hierarchical structure. Our analytical solution aligns with observations from large-scale narrative recall experiments. Specifically, our model explains that (1) average recall length increases sublinearly with narrative length, and (2) individuals summarize increasingly longer narrative segments in each recall sentence. Additionally, the theory predicts that for sufficiently long narratives, a universal, scale-invariant limit emerges, where the fraction of a narrative summarized by a single recall sentence follows a distribution independent of narrative length.
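The tree ensemble and capacity-limited recall described above can be illustrated with a minimal simulation sketch. It is not the authors' code and rests on several assumptions: a chunk of $n$ clauses is split into $K$ ordered, possibly empty sub-chunks drawn uniformly by stars and bars; a chunk becomes a leaf when it holds a single clause or when a split leaves all of its clauses in one sub-chunk; and recall returns every non-empty node reached at level $D$, plus any leaves met earlier. The function names `random_split`, `recalled_chunks`, and `mean_recall_length` are hypothetical.

```python
import random


def random_split(n, k, rng):
    """Split n clauses into k ordered chunks (some possibly empty).

    Stars and bars: place k-1 bars among n+k-1 slots uniformly at random;
    the runs of stars between consecutive bars are the chunk sizes.
    """
    bars = sorted(rng.sample(range(n + k - 1), k - 1))
    sizes, prev = [], -1
    for b in bars + [n + k - 1]:
        sizes.append(b - prev - 1)
        prev = b
    return sizes


def recalled_chunks(n, k, d, rng, level=1):
    """Sizes of the chunks recalled when descent is capped at level d."""
    if level == d or n == 1:
        return [n]  # capacity limit reached, or a single clause
    sizes = [s for s in random_split(n, k, rng) if s > 0]
    if len(sizes) == 1:
        return [n]  # assumed rule: the chunk failed to split further
    out = []
    for s in sizes:
        out.extend(recalled_chunks(s, k, d, rng, level + 1))
    return out


def mean_recall_length(n, k, d, trials, seed=0):
    """Average number of recalled chunks over independent random trees."""
    rng = random.Random(seed)
    total = sum(len(recalled_chunks(n, k, d, rng)) for _ in range(trials))
    return total / trials
```

Under these assumptions each recalled chunk size plays the role of a compression ratio, the sizes always sum to the encoded length, and the number of recalled chunks can never exceed $K^{D-1}$, which is the saturation value mentioned in the TL;DR. Calling `mean_recall_length(n, 4, 4, 10**4)` for a range of `n` gives a rough numerical view of how recall length grows with encoded length.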

Paper Structure

This paper contains 9 sections, 43 equations, 5 figures.

Figures (5)

  • Figure 1: Ensemble of random trees. (a) Schematics of memory retrieval from a random hierarchical representation. An example of a single realization of the random tree created by the model for $N=42$ encoded clauses with branching ratio $K=4$ (empty nodes are not shown). Internal nodes are shown in green, while the terminal nodes (leaves) are shown in blue. The tree-generating process starts with the whole narrative contained in the root node (level 1), which is subsequently split into up to 4 chunks at the next level. The splitting continues self-similarly until a chunk either fails to split further (e.g., the blue "4" at level 3) or becomes a single clause (the blue "1"s). The grey shaded area illustrates the limit imposed by working memory capacity as retrieval starts by descending from the root (retrieved nodes are shown in red). (b)-(c) Comparison between analytical solution and numerical simulations. (b) Mean recalled length $C$ as a function of encoded length $N$. Numerical simulations are averaged over $10^4$ realizations of random trees (see details in SI Sec. A). (c) Distribution of chunk size at the $D^{th}$ level $n^{(D)}$ ($D=4$), given root size $n^{(1)}=N$. The range between tick marks on the y-axis corresponds to $[0,1]$. (d) Distribution of compression ratios scaled by $N$ as a function of the compression ratios divided by $N$. Simulations of different $N$ are shown in different shades of green. The red dashed line is the asymptotic scaling function from Eq. \ref{eq:scaling_function}.
  • Figure 2: Comparison between theory and experiment. (a) Average size of the tree memory representation of each narrative ($N$), estimated as explained in the text, plotted vs narrative length ($L$) for 11 narratives in the dataset. Dashed line corresponds to $N=0.5L$. (b) The mean number of recalled clauses $C$ vs average $N$, for all 11 narratives. Blue filled circles - data. Red dashed line - theoretical prediction obtained from Eqs. \ref{eq:Pn}-\ref{eq:CvsK,D} with $K=D=4$. Error bars in (a,b) are standard error of the mean. (c) Normalized empirical histograms of compression ratios for all subjects separately for each narrative, as measured from mapping recalled clauses back to the narrative clauses. Data for different narratives are shown in color corresponding to the colorbar marked with values of $N$ for each narrative. Solid lines - theoretical predictions obtained from Eq. \ref{eq:Pn} with $K=D=4$. Range between tick marks on the y-axis is $[0,1]$. (d) The distribution of experimentally measured compression ratios relative to $N$ approaches the universal scale-invariant scaling function $f$ in Eq. \ref{eq:scaling_function} as $N$ increases.
  • Figure S1: Schematics for recall mappings. Top: $C$ recall clauses (referred to as "clauses" in the prompts). Middle: $L$ narrative clauses (referred to as "segments" in the prompts). Arrows indicate mappings generated by LLMs. Bottom: An example binarized mapping vector $\vec{v}$ for the first recalled clause $r_1$, where mapped narrative clauses are assigned a value of 1, and unmapped ones are assigned a value of 0.
  • Figure S2: Comparison between two language models. (a) Average size of the tree memory representation of each narrative ($N$) generated by GPT-4 vs DeepSeek-V3. The dashed line corresponds to the diagonal. (b) The mean number of recalled clauses $C$ vs. average $N$, for all 11 narratives. Blue filled circles - GPT-4 generated mappings. Black filled circles - DeepSeek-V3 generated mappings. Red dashed line - theoretical prediction for $K=D=4$, same as in the main text, Fig. 2(b). Error bars in (a,b) are standard error of the mean. (c) Normalized empirical histograms of compression ratios for all subjects separately for each narrative, as measured from DeepSeek-V3-generated mappings. Solid lines - theoretical predictions obtained from $K=D=4$, same as in the main text, Fig. 2(c). The range between tick marks on the y-axis is $[0,1]$. (d) The distribution of experimentally measured compression ratios relative to $N$ as mapped by DeepSeek-V3. The scaling function $f$ is the same as in the main text, Fig. 2(d).
  • Figure S3: Similarity between mappings generated by GPT-4 and DeepSeek-V3. (a-k) Distribution of normalized similarity scores $S$ for each mapped recall clause across the 11 narratives analyzed in the main text. The inset shows the averaged similarity score $\langle S \rangle$ within each bin versus compression ratio $n$, where the bins are chosen uniformly on a linear scale for (a)-(e) and uniformly on a logarithmic scale for (f)-(k). (l) The fraction of recall clauses with a perfect maximum similarity score between the two mappings ($S=1$) vs. narrative length $L$.
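
The binarized mapping vector in the Figure S1 caption can be sketched in a few lines. This is only an illustration: the function name `mapping_vector`, the zero-based clause indices, and the example values are hypothetical, and reading the number of ones in the vector as that recall clause's compression ratio is an assumption made here rather than a definition taken from the captions.

```python
def mapping_vector(mapped_indices, num_narrative_clauses):
    """Binary vector of length L for one recall clause.

    Entry i is 1 if narrative clause i (zero-based, an assumption of this
    sketch) is mapped to the recall clause, and 0 otherwise.
    """
    mapped = set(mapped_indices)
    return [1 if i in mapped else 0 for i in range(num_narrative_clauses)]


# Hypothetical example: a recall clause mapped to narrative clauses 2, 3, 5
# in a narrative of L = 8 clauses.
v = mapping_vector([2, 3, 5], 8)

# Assumed reading: the count of mapped clauses acts as the compression ratio.
compression_ratio = sum(v)
```

Representing each recall clause this way makes it straightforward to compare the mappings produced by two language models entry by entry, although the specific similarity score $S$ used in Figure S3 is not defined in the captions and is not reproduced here.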