Predictive Associative Memory: Retrieval Beyond Similarity Through Temporal Co-occurrence

Jason Dury

TL;DR

This work addresses a core limitation of similarity-based memory retrieval by introducing Predictive Associative Memory (PAM), which learns associative structure from temporal co-occurrence via a dual JEPA setup: an Outward JEPA that predicts future states from incoming sensory data, and an Inward JEPA that predicts associatively reachable past states in stored experience. The Inward predictor maps a current state to a region of the embedding space containing past states that co-occurred with it in time, enabling faithful recall of experienced associations and cross-boundary retrieval where embedding similarity fails. On a synthetic benchmark, PAM's top retrieval is a true temporal associate 97% of the time (Association Precision@1 = 0.970), and it achieves Cross-Boundary Recall@20 = 0.421 and discrimination AUCs of 0.916 overall and 0.849 for cross-room pairs; a temporal-shuffle control indicates that the signal reflects genuine temporal structure rather than embedding geometry. The results suggest that memory systems can go beyond similarity, enabling episodic specificity and transitive bridging; stated limitations include the need for entity persistence and joint encoder training for full dual-channel interactions. The work lays a foundation for memory architectures that integrate similarity-based and associative retrieval, with implications for RAG, embodied AI, and more faithful, experience-grounded memory substrates.

Abstract

Current approaches to memory in neural systems rely on similarity-based retrieval: given a query, find the most representationally similar stored state. This assumption -- that useful memories are similar memories -- fails to capture a fundamental property of biological memory: association through temporal co-occurrence. We propose Predictive Associative Memory (PAM), an architecture in which a JEPA-style predictor, trained on temporal co-occurrence within a continuous experience stream, learns to navigate the associative structure of an embedding space. We introduce an Inward JEPA that operates over stored experience (predicting associatively reachable past states) as the complement to the standard Outward JEPA that operates over incoming sensory data (predicting future states). We evaluate PAM as an associative recall system -- testing faithfulness of recall for experienced associations -- rather than as a retrieval system evaluated on generalisation to unseen associations. On a synthetic benchmark, the predictor's top retrieval is a true temporal associate 97% of the time (Association Precision@1 = 0.970); it achieves cross-boundary Recall@20 = 0.421 where cosine similarity scores zero; and it separates experienced-together from never-experienced-together states with a discrimination AUC of 0.916 (cosine: 0.789). Even restricted to cross-room pairs where embedding similarity is uninformative, the predictor achieves AUC = 0.849 (cosine: 0.503, chance). A temporal shuffle control confirms the signal is genuine temporal co-occurrence structure, not embedding geometry: shuffling collapses cross-boundary recall by 90%, replicated across training seeds. All results are stable across seeds (SD < 0.006) and query selections (SD $\leq$ 0.012).
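
To make the headline metrics concrete, the following is a minimal sketch of how precision and recall at depth k could be computed for a single query. It is not the paper's evaluation code: the function name `precision_recall_at_k`, the use of cosine similarity to rank stored states against the predictor's output, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def precision_recall_at_k(predicted, memory, associates, k):
    """Rank stored states by cosine similarity to a predicted embedding.

    predicted  : (d,) vector, e.g. the output of an inward predictor (assumed)
    memory     : (n, d) array of stored state embeddings
    associates : set of memory indices that truly co-occurred with the query
    Returns (precision@k, recall@k).
    """
    p = predicted / np.linalg.norm(predicted)
    m = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    scores = m @ p                       # cosine similarity to each stored state
    top_k = np.argsort(-scores)[:k]      # indices of the k best-scoring states
    hits = sum(1 for i in top_k if int(i) in associates)
    return hits / k, hits / len(associates)

# Toy example (hypothetical data): four stored states in 2-D.
memory = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]])
predicted = np.array([1.0, 0.0])
true_associates = {0, 3}

p1, r1 = precision_recall_at_k(predicted, memory, true_associates, k=1)
p2, r2 = precision_recall_at_k(predicted, memory, true_associates, k=2)
print(p1, r1, p2, r2)
```

In this toy case state 3 is a true associate but points away from the predicted embedding, so it is not retrieved at k = 2. A cross-boundary variant of the recall would, under the same assumptions, restrict `associates` to states from a different room than the query.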

Paper Structure (77 sections, 6 equations, 6 figures, 5 tables)

Figures (6)

  • Figure 1: Association Precision at retrieval depths $k = 1, 5, 20$. The predictor's top retrieval is a true associate 97% of the time; precision dilutes at greater depth as the retrieval window extends beyond the tight association neighbourhood. Cosine and bilinear baselines score near zero at all depths. Error bars show $\pm 1$ SD across three training seeds.
  • Figure 2: Discrimination AUC, overall vs cross-room. All three methods achieve above-chance overall discrimination by exploiting within-room geometric proximity. On cross-room pairs, where the thesis claim lives, cosine and bilinear drop to chance (dashed line) while the predictor maintains AUC = 0.849.
  • Figure 3: Temporal shuffle ablation. Randomly permuting temporal order within trajectories collapses cross-boundary recall by 90% and association precision by 92%, confirming that the predictor learned genuine temporal co-occurrence structure.
  • Figure 4: Held-out query-state evaluation. Train-anchor queries achieve cross-boundary Recall@20 (CBR@20) = 0.508; queries never used as training anchors score zero. The predictor recalls associations from experienced viewpoints only, consistent with anchor-specific episodic recall.
  • Figure 5: Architecture selection progression. Cross-room R@20 improves 11.4$\times$ from baseline to final configuration (D2), driven primarily by data coverage and the interaction of capacity with coverage.
  • ...and 1 more figure
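
The temporal shuffle ablation in Figure 3 can be illustrated with a small sketch. It assumes, hypothetically, that training pairs are drawn from states lying within a fixed number of steps of each other in a trajectory; the window size, the function names `cooccurrence_pairs` and `shuffled_control`, and the toy trajectory are illustrative and are not taken from the paper.

```python
import random

def cooccurrence_pairs(trajectory, window):
    """Collect (anchor, associate) pairs of states at most `window` steps apart."""
    pairs = set()
    for i, anchor in enumerate(trajectory):
        for j in range(i + 1, min(i + window + 1, len(trajectory))):
            pairs.add((anchor, trajectory[j]))
    return pairs

def shuffled_control(trajectory, seed):
    """Return a copy of the trajectory with its temporal order randomly permuted."""
    rng = random.Random(seed)
    shuffled = list(trajectory)
    rng.shuffle(shuffled)
    return shuffled

# Toy trajectory of ten distinct state ids (hypothetical data).
trajectory = list(range(10))
real_pairs = cooccurrence_pairs(trajectory, window=2)
control_pairs = cooccurrence_pairs(shuffled_control(trajectory, seed=0), window=2)

# Same states and same number of pairs, but the pairing no longer
# reflects the order in which the states were actually experienced.
print(len(real_pairs), len(control_pairs))
```

The point of such a control is that the shuffled stream contains exactly the same embeddings, so any performance that survives shuffling must come from embedding geometry rather than from temporal structure.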