Text2Mem: A Unified Memory Operation Language for Memory Operating System
Yi Wang, Lihai Yang, Boyu Chen, Gongyi Zou, Kerun Xu, Bo Tang, Feiyu Xiong, Siheng Chen, Zhiyu Li
TL;DR
Text2Mem tackles fragmentation in memory control for LLM agents by introducing a unified, schema-based memory operation language. It defines twelve verbs spanning encoding, storage, and retrieval, governed by a compact five-field operation schema and executed through a validator–parser–adapter pipeline, with backends including a SQL prototype and real memory frameworks. The approach emphasizes explicit semantics, safety invariants, and cross-backend portability, demonstrated through illustrative workflows such as semantic promotion and incident postmortems. Additionally, Text2Mem Bench provides an end-to-end, reproducible benchmark that separates planning from execution to evaluate both schema generation and actual memory effects. Together, the work establishes a formal foundation for reliable, auditable memory control in long-horizon agents and paves the way for reproducible research across backends.
Abstract
Large language model agents increasingly depend on memory to sustain long-horizon interaction, but existing frameworks remain limited. Most expose only a few basic primitives such as encode, retrieve, and delete, while higher-order operations like merge, promote, demote, split, lock, and expire are missing or inconsistently supported. Moreover, there is no formal and executable specification for memory commands, leaving scope and lifecycle rules implicit and causing unpredictable behavior across systems. We introduce Text2Mem, a unified memory operation language that provides a standardized pathway from natural language to reliable execution. Text2Mem defines a compact yet expressive operation set aligned with encoding, storage, and retrieval. Each instruction is represented as a JSON-based schema instance with required fields and semantic invariants, which a parser transforms into typed operation objects with normalized parameters. A validator ensures correctness before execution, while adapters map typed objects either to a SQL prototype backend or to real memory frameworks. Model-based services such as embeddings or summarization are integrated when required. All results are returned through a unified execution contract. This design ensures safety, determinism, and portability across heterogeneous backends. We also outline Text2Mem Bench, a planned benchmark that separates schema generation from backend execution to enable systematic evaluation. Together, these components establish the first standardized foundation for memory control in agents.
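To make the pipeline described above more concrete, the following is a minimal sketch, not the Text2Mem implementation. It assumes a reduced two-field instruction (`op`, `args`) rather than the full operation schema, covers only three of the verbs (encode, retrieve, delete), and uses an in-memory SQLite table as a stand-in for the SQL prototype backend. All names here (`Operation`, `validate`, `parse`, `SQLiteAdapter`, and the `ok`/`op`/`data` result keys) are hypothetical, and plain substring matching replaces any embedding-based retrieval.

```python
import json
import sqlite3
from dataclasses import dataclass, field

# Hypothetical field names and verb subset; the real Text2Mem schema may differ.
REQUIRED_FIELDS = ("op", "args")
SUPPORTED_OPS = {"encode", "retrieve", "delete"}


@dataclass
class Operation:
    """Typed operation object produced by the parser."""
    op: str
    args: dict = field(default_factory=dict)


def validate(raw) -> None:
    """Check required fields and a few illustrative invariants before execution."""
    if not isinstance(raw, dict):
        raise ValueError("instruction must be a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if not isinstance(raw["op"], str) or raw["op"].lower() not in SUPPORTED_OPS:
        raise ValueError(f"unsupported operation: {raw['op']!r}")
    if not isinstance(raw["args"], dict):
        raise ValueError("args must be a JSON object")
    op = raw["op"].lower()
    # Assumed invariants: encode needs content, delete needs an explicit target.
    if op == "encode" and "text" not in raw["args"]:
        raise ValueError("encode requires args.text")
    if op == "delete" and "id" not in raw["args"]:
        raise ValueError("delete requires args.id")


def parse(text: str) -> Operation:
    """Turn a JSON instruction into a typed operation with a normalized verb."""
    raw = json.loads(text)
    validate(raw)
    return Operation(op=raw["op"].lower(), args=dict(raw["args"]))


class SQLiteAdapter:
    """Maps typed operations onto a toy SQL backend and returns uniform results."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE memory (id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )

    def execute(self, operation: Operation) -> dict:
        if operation.op == "encode":
            cur = self.conn.execute(
                "INSERT INTO memory (text) VALUES (?)", (operation.args["text"],)
            )
            return {"ok": True, "op": "encode", "data": {"id": cur.lastrowid}}
        if operation.op == "retrieve":
            # Substring match only; a real backend would likely use embeddings.
            pattern = f"%{operation.args.get('query', '')}%"
            rows = self.conn.execute(
                "SELECT id, text FROM memory WHERE text LIKE ? ORDER BY id",
                (pattern,),
            ).fetchall()
            return {"ok": True, "op": "retrieve", "data": {"rows": rows}}
        if operation.op == "delete":
            cur = self.conn.execute(
                "DELETE FROM memory WHERE id = ?", (operation.args["id"],)
            )
            return {"ok": True, "op": "delete", "data": {"deleted": cur.rowcount}}
        raise ValueError(f"no adapter mapping for {operation.op!r}")


if __name__ == "__main__":
    adapter = SQLiteAdapter()
    instruction = '{"op": "Encode", "args": {"text": "deploy failed at 02:00"}}'
    print(adapter.execute(parse(instruction)))
```

In this sketch every call returns the same result shape regardless of the verb, which is one simple way to read the "unified execution contract" mentioned above; rejecting a `delete` without an explicit target before it reaches the backend illustrates the kind of invariant a validator could enforce.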
