A representational framework for learning and encoding structurally enriched trajectories in complex agent environments
Corina Catarau-Cotutiu, Esther Mondragon, Eduardo Alonso
TL;DR
SETLE introduces Structurally Enriched Trajectories (SETs) and a Hierarchical Memory Graph to represent task execution across objects, interactions, affordances, and states. By encoding SETs with a heterogeneous graph encoder inspired by HeCo and training via contrastive and triplet losses, SETLE learns embeddings that capture cross-episode structure and transfer across CREATE and MiniGrid. Integrating SETLE into a reinforcement learning loop with memory retrieval, adapters, penalties, and soft updates yields improved sample efficiency, stability, and generalisation in both physically grounded and symbolic tasks, despite challenges in sparse-reward settings. The work suggests structured trajectory representations as a path toward lifelong, transferable intelligence that leverages cross-task commonalities beyond perceptual similarity.
Abstract
The ability of artificial intelligence agents to make optimal decisions and generalise them to different domains and tasks is compromised in complex scenarios. One approach to this issue has focused on learning efficient representations of the world and of how the actions of agents affect them in state-action transitions. Although such representations are procedurally efficient, they lack structural richness. To address this problem, we propose to enhance the agent's ontology and extend the traditional conceptualisation of trajectories to provide a more nuanced view of task execution. Structurally Enriched Trajectories (SETs) extend the encoding of sequences of states and their transitions by incorporating hierarchical relations between objects, interactions, and affordances. SETs are built as multi-level graphs, providing a detailed representation of the agent dynamics and a transferable functional abstraction of the task. SETs are integrated into an architecture, Structurally Enriched Trajectory Learning and Encoding (SETLE), that employs a heterogeneous graph-based memory structure to capture the multi-level relational dependencies essential for generalisation. We demonstrate that SETLE can support downstream tasks, enabling agents to recognise task-relevant structural patterns across the CREATE and MiniGrid environments. Finally, we integrate SETLE with reinforcement learning and show measurable improvements in downstream performance, including breakthrough success rates in complex, sparse-reward tasks.
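To make the notion of a multi-level graph concrete, the following is a minimal sketch of how a single SET might be held in memory. It is an illustration under stated assumptions, not the SETLE implementation: the class name `SET`, the relation labels (`next`, `contains`, `involves`, `affords`, `leads_to`), and the key-and-door example are all hypothetical. Only the four node levels (objects, interactions, affordances, states) come from the description above.

```python
from dataclasses import dataclass, field

# Node levels named in the abstract; everything else below is hypothetical.
LEVELS = ("object", "interaction", "affordance", "state")


@dataclass
class SET:
    """Toy multi-level graph for one trajectory (illustrative only)."""
    nodes: dict = field(default_factory=dict)  # node id -> level
    edges: list = field(default_factory=list)  # (source id, relation, target id)

    def add_node(self, node_id, level):
        # Reject levels outside the four kinds listed in LEVELS.
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        self.nodes[node_id] = level

    def add_edge(self, src, relation, dst):
        # Typed, directed edge; both endpoints must already exist.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((src, relation, dst))

    def by_level(self, level):
        # Sorted ids of all nodes at the given level.
        return sorted(n for n, lvl in self.nodes.items() if lvl == level)


def build_example():
    """One state transition in an assumed key-and-door task."""
    s = SET()
    for node_id, level in [
        ("s0", "state"), ("s1", "state"),
        ("key", "object"), ("door", "object"),
        ("key-door", "interaction"), ("unlock", "affordance"),
    ]:
        s.add_node(node_id, level)
    s.add_edge("s0", "next", "s1")             # state-level transition
    s.add_edge("s0", "contains", "key")        # state -> object
    s.add_edge("s0", "contains", "door")
    s.add_edge("key-door", "involves", "key")  # interaction -> object
    s.add_edge("key-door", "involves", "door")
    s.add_edge("key-door", "affords", "unlock")  # interaction -> affordance
    s.add_edge("unlock", "leads_to", "s1")     # affordance -> resulting state
    return s
```

Keeping the level on each node and the relation on each edge is what makes the graph heterogeneous: an encoder can treat object, interaction, affordance, and state nodes differently, while the plain state-to-state `next` edges still recover the conventional trajectory.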
