NL2GenSym: Natural Language to Generative Symbolic Rules for SOAR Cognitive Architecture via Large Language Models
Fang Yuan, Junjie Zeng, Yue Hu, Zhengqiu Zhu, Quanjun Yin, Yuxiang Xie
TL;DR
NL2GenSym is an end-to-end framework that translates natural language into executable symbolic rules for the SOAR cognitive architecture, addressing the labor-intensive rule-coding bottleneck. The approach combines a Self-Evolving Domain Knowledge Base with an Execution-Grounded Generator-Critic loop: the Generator emits SOAR rules, execution in SOAR validates them, and the Critic returns targeted natural-language feedback for iterative refinement. Experiments on a Water Jug Problem dataset with Gemini and Qwen models show a rule-generation success rate above 86% and emergent heuristics that substantially reduce decision cycles, with smaller models sometimes surpassing larger ones when guided by the framework. These findings suggest that well-structured neuro-symbolic architectures with execution-grounded feedback can outperform scale alone, offering a practical path toward scalable, rule-based autonomous agents and robust LLM-enabled cognitive architectures.
Abstract
SOAR, a classic symbol-based cognitive architecture, has long supported the development of general, human-like intelligent agents. Nevertheless, its practical adoption is hindered by laborious manual rule coding. Emerging Large Language Models (LLMs) offer immense potential for efficient rule generation. However, a critical gap remains: current research focuses predominantly on conceptual frameworks and lacks robust experimental validation. To bridge this gap, we propose \textit{N}atural \textit{L}anguage to \textit{Gen}erative \textit{Sym}bolic Rules (NL2GenSym), a novel framework that integrates LLMs with SOAR to autonomously produce generative symbolic rules from natural language. Specifically, our framework introduces an Execution-Grounded Generator-Critic mechanism. The LLM-based Generator, guided by a self-evolving domain knowledge base accessed through Retrieval-Augmented Generation, proposes rules from natural language. These rules are then immediately executed within the SOAR environment to validate their correctness. Based on this execution-grounded feedback, a reflective LLM-based Critic drives the iterative refinement of the rules. Experiments on our specialized Water Jug Problem (WJP) dataset, using both Gemini and Qwen series models, validate the efficacy of our framework. It achieves a success rate of over 86\% in generating rules from natural language. Crucially, the framework also generates novel heuristic rules, reducing the average number of decision cycles needed to solve the WJP to 1.98 times that of the optimal solution and 1/1000 of that required by baseline methods. Additionally, our initial experiments show that NL2GenSym enables smaller-parameter models to achieve better performance than their larger counterparts.
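The control flow of the Execution-Grounded Generator-Critic mechanism described above can be sketched as a short loop. This is a minimal illustration only, not the authors' implementation: the names `refine_rules`, `ExecutionResult`, `generate`, `execute`, and `critique` are hypothetical, the retrieval step and knowledge-base updates are omitted, and the toy stand-ins at the bottom replace the LLM calls and the SOAR run that a real system would perform.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ExecutionResult:
    """Outcome of running candidate rules (hypothetical structure)."""
    success: bool  # did the rules run and reach the goal?
    log: str       # trace or error text returned by the run


def refine_rules(
    task: str,
    generate: Callable[[str, List[str]], str],
    execute: Callable[[str], ExecutionResult],
    critique: Callable[[str, ExecutionResult], str],
    max_rounds: int = 5,
) -> Optional[str]:
    """Generate rules, execute them, and feed critiques back until success.

    Returns the first rule set that executes successfully, or None if
    max_rounds attempts all fail.
    """
    feedback: List[str] = []
    for _ in range(max_rounds):
        rules = generate(task, feedback)   # Generator proposes rules
        result = execute(rules)            # execution grounds the check
        if result.success:
            return rules
        feedback.append(critique(rules, result))  # Critic explains the failure
    return None


# Toy stand-ins (assumptions for illustration; a real system would call
# an LLM in generate/critique and the SOAR environment in execute).
def toy_generate(task: str, feedback: List[str]) -> str:
    # The first attempt is deliberately wrong; any feedback "fixes" it.
    return "rule-v2" if feedback else "rule-v1"


def toy_execute(rules: str) -> ExecutionResult:
    ok = rules == "rule-v2"
    return ExecutionResult(ok, "goal reached" if ok else "error: no operator proposed")


def toy_critique(rules: str, result: ExecutionResult) -> str:
    return f"{rules} failed: {result.log}"


if __name__ == "__main__":
    print(refine_rules("measure 2 units with a 3-unit and a 5-unit jug",
                       toy_generate, toy_execute, toy_critique))
```

Passing the three roles in as callables keeps the loop independent of any particular model or SOAR binding, so the stand-ins can be swapped for real components without changing the control flow.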
