SuperLocalMemory V3.3: The Living Brain -- Biologically-Inspired Forgetting, Cognitive Quantization, and Multi-Channel Retrieval for Zero-LLM Agent Memory Systems

Varun Pratap Bhardwaj

Abstract

AI coding agents operate in a paradox: they possess vast parametric knowledge yet cannot remember a conversation from an hour ago. Existing memory systems store text in vector databases with single-channel retrieval, require cloud LLMs for core operations, and implement none of the cognitive processes that make human memory effective. We present SuperLocalMemory V3.3 ("The Living Brain"), a local-first agent memory system implementing the full cognitive memory taxonomy with mathematical lifecycle dynamics. Building on the information-geometric foundations of V3.2 (arXiv:2603.14588), we introduce five contributions: (1) Fisher-Rao Quantization-Aware Distance (FRQAD) -- a new metric on the Gaussian statistical manifold achieving 100% precision at preferring high-fidelity embeddings over quantized ones (vs 85.6% for cosine), with zero prior art; (2) Ebbinghaus Adaptive Forgetting with lifecycle-aware quantization -- the first mathematical forgetting curve in local agent memory coupled to progressive embedding compression, achieving a 6.7x gain in discriminative power; (3) 7-channel cognitive retrieval spanning semantic, keyword, entity graph, temporal, spreading activation, consolidation, and Hopfield associative channels, achieving 70.4% on LoCoMo in zero-LLM Mode A; (4) memory parameterization implementing Long-Term Implicit memory via soft prompts; (5) zero-friction auto-cognitive pipeline automating the complete memory lifecycle. On LoCoMo, V3.3 achieves 70.4% in Mode A (zero-LLM), with +23.8pp on multi-hop and +12.7pp on adversarial. V3.2 achieved 74.8% Mode A and 87.7% Mode C; the 4.4pp gap reflects a deliberate architectural trade-off. SLM V3.3 is open source under the Elastic License 2.0, runs entirely on CPU, and has over 5,000 monthly downloads.
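Contribution (3) combines ranked results from seven heterogeneous retrieval channels. The paper's actual fusion rule is not reproduced on this page; the sketch below uses reciprocal-rank fusion (RRF), a common technique for merging rankings from dissimilar scorers, purely to illustrate the shape of such a combiner. The channel contents and memory IDs are hypothetical.

```python
from collections import defaultdict

def rrf_fuse(channel_rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists from multiple channels via reciprocal-rank fusion.

    Each channel contributes 1 / (k + rank + 1) to an item's score, so items
    ranked highly by several channels dominate without score calibration.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in channel_rankings:
        for rank, mem_id in enumerate(ranking):
            scores[mem_id] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical outputs from three of the seven channels:
channels = [
    ["m1", "m2", "m3"],  # semantic (vector similarity)
    ["m2", "m1"],        # keyword (BM25-style)
    ["m2", "m4"],        # entity graph
]
fused = rrf_fuse(channels)  # "m2" wins: ranked by all three channels
```

RRF needs no per-channel score normalization, which is one reason it is a common default when channels are as dissimilar as vector similarity, keyword matching, and graph traversal.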

Paper Structure

This paper contains 51 sections, 3 theorems, 13 equations, 4 figures, 9 tables.

Key Result

Theorem 4.1

For $b$-bit TurboQuant applied to any unit-norm vector $\mathbf{x} \in \mathbb{R}^d$, the MSE distortion admits an upper bound within $2.7\times$ of the information-theoretic lower bound (the explicit bound is stated in the full paper). $\blacktriangleleft$

Figures (4)

  • Figure 1: SLM V3.3 system architecture. The Interface Layer provides 60 MCP tools, a CLI with daemon serve mode (32$\times$ cold-start speedup), a 17-tab web dashboard, and auto-cognitive hooks for Claude Code. The Engine Layer implements 7-channel cognitive retrieval, Ebbinghaus lifecycle management with EAP precision scheduling, FRQAD/TurboQuant quantization (C1), soft prompt generation (C4), and a code knowledge graph module. The Storage Layer uses local SQLite databases with sqlite-vec for vector operations. Orange blocks indicate novel contributions.
  • Figure 2: Mixed-precision preference: percentage of 18,840 query-fact pairs where the f32 embedding is correctly preferred over the 4-bit quantized version. FRQAD achieves perfect precision (100%) by accounting for quantization uncertainty via variance inflation on the Fisher-Rao geodesic.
  • Figure 3: Ebbinghaus retention curves over 30 simulated days. Hot facts (daily access) converge toward the polar4 tier ($R \approx 0.35$). Warm facts (every 3 days) show a characteristic cyclic pattern. Cold facts decay immediately below the forget threshold. Dotted lines indicate EAP precision tier boundaries.
  • Figure 4: LoCoMo per-category comparison. V3.3 R3 surpasses Paper 2 on adversarial (+6.1pp) and substantially closes the multi-hop gap (+23.8pp from baseline). The single-hop regression ($-$14.9pp vs Paper 2) reflects 7-channel fusion complexity.
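The retention dynamics behind Figure 3 can be sketched with the classic Ebbinghaus form $R = e^{-t/S}$, where $S$ is a memory strength reinforced on each access. The constants below (reinforcement factor, forget threshold) are illustrative assumptions, not the paper's values; they are chosen only to reproduce the hot/warm/cold separation the figure describes.

```python
import math

# Illustrative constants -- the paper's actual parameters are not given here.
FORGET_THRESHOLD = 0.35  # roughly the polar4 tier boundary from Figure 3
STRENGTH_BOOST = 1.8     # assumed multiplicative reinforcement per access

def retention(days_since_access: float, strength: float) -> float:
    """Ebbinghaus retention R = exp(-t / S) (assumed form of Defs. 5.1-5.2)."""
    return math.exp(-days_since_access / strength)

def simulate(access_days, horizon: int = 30) -> float:
    """Retention at `horizon`, reinforcing strength on each listed access day."""
    strength, last = 1.0, 0
    for day in sorted(access_days):
        strength *= STRENGTH_BOOST  # each access slows subsequent decay
        last = day
    return retention(horizon - last, strength)

hot = simulate(range(1, 30))      # accessed daily: retention stays near 1
warm = simulate(range(3, 30, 3))  # accessed every 3 days: cyclic but retained
cold = simulate([])               # never re-accessed: decays below threshold
```

Under these assumptions, spaced accesses keep warm facts above the forget threshold while untouched facts fall below it within days, matching the qualitative behavior in Figure 3.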

Theorems & Definitions (6)

  • Theorem 4.1: MSE Distortion Upper Bound (TurboQuant)
  • Definition 4.2: FRQAD
  • Proposition 4.3: Monotonic degradation
  • Definition 5.1: Memory Strength
  • Definition 5.2: Retention
  • Theorem 5.3: Convergence of Ebbinghaus-Fokker-Planck System
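Definition 4.2's FRQAD is not reproduced on this page. A minimal sketch of the underlying idea, assuming each embedding coordinate is modeled as a univariate Gaussian whose standard deviation is inflated by quantization noise, uses the closed-form Fisher-Rao geodesic distance between normals; the sigma values and example vectors below are illustrative, not the paper's.

```python
import math

def fisher_rao_1d(mu1: float, s1: float, mu2: float, s2: float) -> float:
    """Closed-form Fisher-Rao geodesic distance between N(mu1, s1^2), N(mu2, s2^2)."""
    a = math.sqrt((mu1 - mu2) ** 2 / 2 + (s1 + s2) ** 2)
    b = math.sqrt((mu1 - mu2) ** 2 / 2 + (s1 - s2) ** 2)
    return 2 * math.sqrt(2) * math.atanh(b / a) if a > 0 else 0.0

def frqad_sketch(query, candidate, sigma_query, sigma_cand) -> float:
    """Aggregate per-coordinate distances; sigma_cand encodes quantization noise."""
    return sum(fisher_rao_1d(q, sigma_query, c, sigma_cand)
               for q, c in zip(query, candidate))

query = [0.6, 0.8]
f32_fact = [0.6, 0.8]      # full-precision embedding: tiny uncertainty
q4_fact = [0.625, 0.8125]  # 4-bit copy: rounded means, inflated sigma

d_f32 = frqad_sketch(query, f32_fact, 0.01, 0.01)
d_q4 = frqad_sketch(query, q4_fact, 0.01, 0.10)
```

Inflating sigma for the quantized copy pushes it farther along the geodesic even when its means are close to the query, which is the mechanism Figure 2 attributes to FRQAD's perfect preference for f32 over 4-bit embeddings.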