Facts as First Class Objects: Knowledge Objects for Persistent LLM Memory

Oliver Zahn; Simran Chana

Abstract

Large language models increasingly serve as persistent knowledge workers, with in-context memory (facts stored in the prompt) as the default strategy. We benchmark in-context memory against Knowledge Objects (KOs), discrete hash-addressed tuples with O(1) retrieval. Within the context window, Claude Sonnet 4.5 achieves 100% exact-match accuracy from 10 to 7,000 facts (97.5% of its 200K window). However, production deployment reveals three failure modes: capacity limits (prompts overflow at 8,000 facts), compaction loss (summarization destroys 60% of facts), and goal drift (cascading compaction erodes 54% of project constraints while the model continues with full confidence). KOs achieve 100% accuracy across all conditions at 252x lower cost. On multi-hop reasoning, KOs reach 78.9% accuracy versus 31.6% for in-context memory. Cross-model replication across four frontier models confirms that compaction loss is architectural, not model-specific. We additionally show that embedding retrieval fails on adversarial facts (20% precision@1) and that neural memory (Titans) stores facts but fails to retrieve them on demand. We introduce density-adaptive retrieval as a switching mechanism and release the benchmark suite.
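
To make "discrete hash-addressed tuples with O(1) retrieval" concrete, the following is a minimal sketch of how such a store might look. It is not the authors' implementation: the `(subject, predicate, value)` schema, the SHA-256 key derivation, and the names `KnowledgeObject`, `ko_key`, `KOStore`, and `two_hop` are all assumptions made for illustration, as is the example data.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class KnowledgeObject:
    """One stored fact. The three-field schema is an assumption."""
    subject: str
    predicate: str
    value: str


def ko_key(subject: str, predicate: str) -> str:
    """Derive a stable address from the normalized (subject, predicate) pair."""
    raw = f"{subject.strip().lower()}\x1f{predicate.strip().lower()}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class KOStore:
    """Hypothetical fact store backed by a dict (average O(1) lookup)."""

    def __init__(self) -> None:
        self._facts: Dict[str, KnowledgeObject] = {}

    def put(self, subject: str, predicate: str, value: str) -> str:
        key = ko_key(subject, predicate)
        self._facts[key] = KnowledgeObject(subject, predicate, value)
        return key

    def get(self, subject: str, predicate: str) -> Optional[KnowledgeObject]:
        return self._facts.get(ko_key(subject, predicate))

    def two_hop(self, subject: str, first: str, second: str) -> Optional[str]:
        """Chain two exact lookups: the value of hop 1 is the subject of hop 2."""
        hop1 = self.get(subject, first)
        if hop1 is None:
            return None
        hop2 = self.get(hop1.value, second)
        return hop2.value if hop2 is not None else None


# Illustrative data only; not taken from the paper's benchmark.
store = KOStore()
store.put("Project Atlas", "lead", "Dana Reyes")
store.put("Dana Reyes", "office", "Building 7")
```

Because each fact lives at its own address, a lookup either returns the stored value verbatim or returns nothing; in this sketch there is no summarization step that could paraphrase or drop it, which is the property the abstract contrasts with prompt compaction.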

Paper Structure (58 sections, 5 equations, 7 figures, 12 tables, 1 algorithm)

This paper contains 58 sections, 5 equations, 7 figures, 12 tables, 1 algorithm.

Introduction
The Problem: Context Rot
Capacity limits.
Compaction loss.
Goal drift.
Our Approach: Knowledge Objects
Contributions
Scaling benchmark (§\ref{sec:scaling}).
Context rot quantification (§\ref{sec:rot}).
Density-adaptive retrieval (§\ref{sec:density}).
Production economics (§\ref{sec:economics}).
Roadmap.
Related Work
Dense Retrieval and Retrieval-Augmented Generation
Context Windows as Memory
...and 43 more sections

Figures (7)

Figure 1: Scaling curve: exact-match accuracy vs. corpus size ($N{=}10$ to $N{=}10{,}000$). Claude Sonnet 4.5 maintains 100% accuracy through $N{=}7{,}000$ (97.5% of context window), then overflows. GPT-4o drops to 0% by $N{=}3{,}000$. KO maintains 100% at all $N$. The dashed line marks the 200K token context window boundary.
Figure 2: Fact retrieval accuracy after 36.7$\times$ compaction. In-context memory loses 60% of facts; KO maintains 100%.
Figure 3: Goal drift under cascading compaction. Left: Stacked bars showing correct, partial, and lost constraints after each round. Right: Accuracy decay vs. compression ratio. KO maintains 100% regardless of compaction.
Figure 4: Multi-hop reasoning accuracy on 2-hop queries over a 500-fact corpus. KO-grounded retrieval achieves 78.9% accuracy, a 47.3 percentage point improvement over full in-context presentation (31.6%).
Figure 5: Cross-domain synthesis quality scores (1--5 scale) across four dimensions. The largest improvement is in groundedness (+118%), where KO retrieval enables claims traceable to specific stored facts.
...and 2 more figures
