Procedural Adherence and Interpretability Through Neuro-Symbolic Generative Agents
Raven Rothkopf, Hannah Tongxin Zeng, Mark Santolucito
TL;DR
LLM-based agents struggle with long-horizon coherence and interpretability. This paper proposes a neuro-symbolic framework that uses Temporal Stream Logic (TSL) and reactive synthesis to produce an automaton governing the agent's high-level prompt decisions. The automaton ensures procedural adherence and makes the agent's temporal behavior interpretable independently of the LLM's internals. On choose-your-own-adventure tasks, automaton-enhanced agents achieve at least 96% adherence to their temporal constraints, whereas a purely LLM-based agent falls as low as 14.67% and exhibits more hallucinations and arithmetic errors. The approach also provides modular, debuggable guarantees and a path toward scalable composition of agent behaviors.
Abstract
The surge in popularity of large language models (LLMs) has opened doors for new approaches to the creation of interactive agents. However, managing and interpreting the temporal behavior of such agents over the course of a potentially infinite interaction remain challenging. The stateful, long-horizon reasoning required for coherent agent behavior does not fit well into the LLM paradigm. We propose a combination of formal logic-based program synthesis and LLM content generation to bring guarantees of procedural adherence and interpretability to generative agent behavior. To illustrate the benefit of procedural adherence and interpretability, we use Temporal Stream Logic (TSL) to generate an automaton that enforces an interpretable, high-level temporal structure on an agent. Because the automaton tracks the context of the interaction and makes the decisions that guide the conversation, it can drive content generation in a way that allows the LLM to focus on a shorter context window. We evaluated our approach on different tasks involved in creating an interactive agent specialized for generating choose-your-own-adventure games. Across all tasks, an automaton-enhanced agent with procedural guarantees achieves at least 96% adherence to its temporal constraints, whereas a purely LLM-based agent demonstrates as low as 14.67% adherence.
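
The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the transition table below is written by hand rather than synthesized from a TSL specification, and the state names, input symbols, `classify` heuristic, and `stub_llm` function are all hypothetical stand-ins. The point is only to show an automaton holding the interaction state and choosing the high-level directive, so that each LLM call needs just the current directive and the latest player input.

```python
from typing import Callable, Dict, List, Tuple

# Hand-written Mealy-style automaton (an assumption for illustration; in the
# paper the automaton is synthesized from a TSL specification).
# (current state, input symbol) -> (next state, directive for the LLM)
TRANSITIONS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("forest", "explore"): ("cave", "Describe the player entering a cave."),
    ("forest", "wait"): ("forest", "Describe the forest at dusk."),
    ("cave", "explore"): ("town", "Describe the player arriving in a town."),
    ("cave", "wait"): ("cave", "Describe echoes deeper in the cave."),
    ("town", "explore"): ("forest", "Describe the player returning to the forest."),
    ("town", "wait"): ("town", "Describe the town square at night."),
}


def classify(player_input: str) -> str:
    """Map free-form text to an input symbol (hypothetical keyword heuristic)."""
    words = set(player_input.lower().split())
    return "explore" if words & {"go", "explore", "enter"} else "wait"


def stub_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; echoes the directive line of the prompt."""
    return "[generated from: " + prompt.splitlines()[0] + "]"


def run_agent(
    inputs: List[str],
    llm: Callable[[str], str],
    start: str = "forest",
) -> List[Tuple[str, str]]:
    """Drive generation with the automaton; return (state, passage) per turn."""
    state = start
    trace: List[Tuple[str, str]] = []
    for text in inputs:
        symbol = classify(text)
        # The automaton, not the LLM, decides where the story goes next.
        state, directive = TRANSITIONS[(state, symbol)]
        # The prompt carries only the directive and the latest input,
        # not the full conversation history.
        passage = llm(directive + "\nPlayer said: " + text)
        trace.append((state, passage))
    return trace


if __name__ == "__main__":
    for state, passage in run_agent(["go north", "hmm", "explore further"], stub_llm):
        print(state, passage)
```

Because every state change passes through the explicit table, the sequence of states can be inspected and checked against the intended temporal structure without examining the LLM itself, which is the sense of interpretability the abstract describes.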
