Temporal Knowledge-Graph Memory in a Partially Observable Environment
Taewoon Kim, Vincent François-Lavet, Michael Cochez
TL;DR
This work addresses how to represent and use long-term memory in partially observable environments by giving both the world state and the agent’s memory explicit knowledge-graph (KG) representations. It introduces Room Environment v3, a deterministic, KG-centered testbed in which observations and the hidden state are RDF graphs, and memory can be extended to a temporal KG (TKG) via RDF-star qualifiers. The study compares symbolic KG-based memory (plain RDF, and RDF-star with the qualifiers time_added, last_accessed, and num_recalled) with neural sequence baselines (LSTM and Transformer) under identical observations and query conditions, across memory capacities. Findings show that temporal qualifiers substantially improve stability and generalization: the symbolic TKG agent achieves roughly fourfold higher QA accuracy than the neural baselines at high memory capacity, and symbolic memory enables full room coverage and transparent memory evolution. The results demonstrate the value of interpretable, graph-structured memory in partially observable domains and provide a reproducible benchmarking platform for future neuro-symbolic memory research.
Abstract
Agents in partially observable environments require persistent memory to integrate observations over time. While knowledge graphs (KGs) provide a natural representation for such evolving state, existing benchmarks rarely expose agents to environments where both the world dynamics and the agent's memory are explicitly graph-shaped. We introduce Room Environment v3, a configurable environment whose hidden state is an RDF KG and whose observations are RDF triples. The agent may extend these observations into a temporal KG when storing them in long-term memory. The environment is configurable in grid size, number of rooms, inner walls, and moving objects. We define a lightweight temporal KG memory for agents, based on RDF-star-style qualifiers (time_added, last_accessed, num_recalled), and evaluate several symbolic baselines that maintain and query this memory under different capacity constraints. Two neural sequence models (LSTM and Transformer) serve as contrasting baselines without explicit KG structure. Agents train on one layout and are evaluated on a held-out layout with the same dynamics but a different query order, exposing train-test generalization gaps. In this setting, temporal qualifiers lead to more stable performance, and the symbolic temporal knowledge graph (TKG) agent achieves roughly fourfold higher test question-answering (QA) accuracy than the neural baselines under the same environment and query conditions. The environment, agent implementations, and experimental scripts are released for reproducible research at https://github.com/humemai/agent-room-env-v3 and https://github.com/humemai/room-env.
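To make the memory structure concrete, the following is a minimal Python sketch of a capacity-limited temporal KG memory in which each stored triple carries the three qualifiers named above. It is not the released implementation: the class and method names, the entity and relation names ("laptop", "at_location"), and in particular the eviction rule (drop the least recently accessed triple) are assumptions chosen for illustration. The abstract states only that several symbolic baselines are evaluated, not which policies they use.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Qualifiers:
    """RDF-star-style annotations attached to one stored triple."""
    time_added: int
    last_accessed: int
    num_recalled: int = 0


@dataclass
class TemporalKGMemory:
    """Hypothetical capacity-limited memory; not the authors' code."""
    capacity: int
    store: Dict[Triple, Qualifiers] = field(default_factory=dict)

    def add(self, triple: Triple, now: int) -> None:
        # Re-observing a known triple only refreshes its access time.
        if triple in self.store:
            self.store[triple].last_accessed = now
            return
        if len(self.store) >= self.capacity:
            self._evict()
        self.store[triple] = Qualifiers(time_added=now, last_accessed=now)

    def _evict(self) -> None:
        # Assumed policy: remove the least recently accessed triple,
        # breaking ties in favour of removing the oldest one.
        victim = min(
            self.store,
            key=lambda t: (self.store[t].last_accessed, self.store[t].time_added),
        )
        del self.store[victim]

    def query(self, subject: str, predicate: str, now: int) -> Optional[str]:
        # Answer "(subject, predicate, ?)" with the most recently added match.
        matches = [t for t in self.store if t[0] == subject and t[1] == predicate]
        if not matches:
            return None
        best = max(matches, key=lambda t: self.store[t].time_added)
        quals = self.store[best]
        quals.last_accessed = now
        quals.num_recalled += 1
        return best[2]


# Usage with made-up observations and a capacity of two triples.
memory = TemporalKGMemory(capacity=2)
memory.add(("laptop", "at_location", "kitchen"), now=0)
memory.add(("phone", "at_location", "bedroom"), now=1)
answer = memory.query("laptop", "at_location", now=2)  # recall refreshes the laptop triple
memory.add(("keys", "at_location", "hallway"), now=3)  # memory is full, so the phone triple is evicted
forgotten = memory.query("phone", "at_location", now=4)
```

In this sketch, recalling a triple updates last_accessed and num_recalled, so the qualifiers influence which facts survive when capacity is reached. A different policy (for example, evicting by time_added or by num_recalled) would change only the key function in `_evict`.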
