NL2GenSym: Natural Language to Generative Symbolic Rules for SOAR Cognitive Architecture via Large Language Models

Fang Yuan, Junjie Zeng, Yue Hu, Zhengqiu Zhu, Quanjun Yin, Yuxiang Xie

TL;DR

NL2GenSym presents a novel, end-to-end framework that translates natural language into executable symbolic rules for the SOAR cognitive architecture, addressing the labor-intensive rule-coding bottleneck. The approach combines a Self-Evolving Domain Knowledge Base with an Execution-Grounded Generator-Critic loop: the Generator emits SOAR rules, execution in SOAR validates them, and the Critic provides targeted natural-language feedback for iterative refinement. Experiments on a Water Jug Problem dataset with Gemini and Qwen models show high rule-generation success and emergent, high-efficiency heuristics that significantly reduce decision cycles, with smaller models sometimes surpassing larger ones when guided by the framework. These findings suggest that well-structured neuro-symbolic architectures with execution-grounded feedback can matter more than model scale alone, offering a practical pathway toward scalable, rule-based autonomous agents and contributing to the broader pursuit of robust LLM-enabled cognitive architectures.

Abstract

SOAR, a classic symbol-based cognitive architecture, has long fostered the development of general, human-like intelligent agents. Nevertheless, its practical adoption is hindered by laborious manual rule coding. Emerging Large Language Models (LLMs) offer considerable potential for efficient rule generation. However, a critical gap remains: current research focuses predominantly on conceptual frameworks and lacks robust experimental validation. To bridge this gap, we propose Natural Language to Generative Symbolic Rules (NL2GenSym), a novel framework that integrates LLMs with SOAR to autonomously produce generative symbolic rules from natural language. Specifically, our framework introduces a novel Execution-Grounded Generator-Critic mechanism. The LLM-based Generator, guided by a self-evolving domain knowledge base accessed through Retrieval-Augmented Generation (RAG), proposes rules from natural language. These rules are then immediately executed within the SOAR environment to rigorously validate their correctness. Based on this execution-grounded feedback, a reflective LLM-based Critic drives the iterative refinement of the rules. Experiments on our specialized Water Jug Problem (WJP) dataset, using both Gemini and Qwen series models, validate the efficacy of our framework. It achieves a success rate of over 86% in generating rules from natural language. Crucially, the framework also generates novel heuristic rules, reducing the average number of decision cycles for solving the WJP to 1.98 times that of the optimal solution and 1/1000 of that of baseline methods. Additionally, our initial experiments show that NL2GenSym enables smaller-parameter models to achieve better performance than their larger counterparts.
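
The Generator-Critic mechanism described in the abstract can be pictured as a simple control loop. The sketch below is illustrative only and is not the authors' implementation: the function names, the `ExecutionResult` fields, the fixed `max_rounds` budget, and the toy stand-ins for the Generator, the SOAR executor, and the Critic are all assumptions, and the RAG-accessed knowledge base is left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class ExecutionResult:
    """Hypothetical summary of one SOAR run (field names are assumptions)."""
    solved: bool
    decision_cycles: int
    trace: str


def refine_rules(
    task: str,
    generate: Callable[[str, str], str],
    execute: Callable[[str], ExecutionResult],
    critique: Callable[[str, str, ExecutionResult], str],
    max_rounds: int = 3,
) -> Optional[Tuple[str, ExecutionResult]]:
    """Generate rules, run them, collect critic feedback, and repeat.

    Returns the best (rules, result) pair seen, or None if no round solved the task.
    """
    feedback = ""
    best = None
    for _ in range(max_rounds):
        rules = generate(task, feedback)   # Generator proposes rules
        result = execute(rules)            # execution grounds the evaluation
        if result.solved and (
            best is None or result.decision_cycles < best[1].decision_cycles
        ):
            best = (rules, result)
        feedback = critique(task, rules, result)  # Critic suggests revisions
    return best


# Toy stand-ins so the loop can run without an LLM or a SOAR kernel.
def toy_generate(task: str, feedback: str) -> str:
    return "rules-v2" if feedback else "rules-v1"


def toy_execute(rules: str) -> ExecutionResult:
    if rules == "rules-v1":
        return ExecutionResult(False, 0, "no operator proposed")
    return ExecutionResult(True, 8, "goal reached")


def toy_critique(task: str, rules: str, result: ExecutionResult) -> str:
    return "ok" if result.solved else "add a proposal rule for Pour"


best = refine_rules(
    "5 L and 3 L jugs; reach 1 L", toy_generate, toy_execute, toy_critique
)
```

In a real setting the three callables would wrap LLM calls and a SOAR run; keeping them as plain function arguments makes the loop testable in isolation.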

Paper Structure

This paper contains 30 sections, 3 equations, 12 figures, 6 tables.

Figures (12)

  • Figure 1: Basic architecture of the SOAR cognitive model (Laird, 2012).
  • Figure 2: Problem space representation of the WJP. Each square (x, y) represents a state, where x and y are the water volumes in the 5 L and 3 L jugs, respectively. The initial state (0, 0) is denoted by the blue square, and a goal state (any state in which a jug contains 1 L) is indicated by the green square. The colored arrows depict the state transitions performed by distinct Operators: Fill(jug) fills a jug to its capacity, Empty(jug) empties its contents, and Pour(jug_from, jug_to) pours water between jugs until the source is empty or the destination is full. The path highlighted by dashed arrows illustrates the optimal solution, corresponding to five decision cycles in SOAR.
  • Figure 3: The decision cycle in SOAR.
  • Figure 4: Operational workflow of the NL2GenSym framework, illustrated with the WJP. Natural language WJP descriptions are processed by NGS-G (b), which uses the Self-Evolving Domain Knowledge Base (a) via RAG to produce symbolic rules. These rules undergo N execution cycles in SOAR. Comprehensive feedback (execution traces, performance data, the generated rules, and the original problem description) is then analyzed by NGS-C (c), which also leverages the Knowledge Base (a) via RAG. NGS-C (c) outputs natural language optimization suggestions that are fed back to NGS-G (b). Additionally, the case knowledge is iteratively updated by storing the currently best historical cases from SOAR execution as high-quality "seed" rules within the Knowledge Base (a).
  • Figure 5: Results of ablation studies for Success Rate (%) metrics.
  • ...and 7 more figures
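
The WJP problem space described in the Figure 2 caption (5 L and 3 L jugs, Fill, Empty, and Pour operators, goal of any state containing 1 L) is small enough to explore exhaustively. The following is a minimal sketch in plain Python using breadth-first search; it is not SOAR code and not the authors' method, and the function and label names are made up for illustration. It counts operator applications only and does not model SOAR's decision cycle, so its count need not equal the five decision cycles quoted in the caption.

```python
from collections import deque

CAPACITY = (5, 3)  # jug capacities in litres, as in the Figure 2 caption


def successors(state):
    """Yield (operator_label, next_state) for every Fill, Empty, and Pour."""
    for i in range(2):
        j = 1 - i
        filled = list(state)
        filled[i] = CAPACITY[i]
        yield f"Fill({CAPACITY[i]}L)", tuple(filled)

        emptied = list(state)
        emptied[i] = 0
        yield f"Empty({CAPACITY[i]}L)", tuple(emptied)

        # Pour until the source is empty or the destination is full.
        amount = min(state[i], CAPACITY[j] - state[j])
        poured = list(state)
        poured[i] -= amount
        poured[j] += amount
        yield f"Pour({CAPACITY[i]}L->{CAPACITY[j]}L)", tuple(poured)


def shortest_plan(start=(0, 0), target=1):
    """Breadth-first search for a shortest operator sequence to a goal state."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if target in state:  # goal: either jug holds exactly `target` litres
            return plan, state
        for label, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [label]))
    return None


plan, final_state = shortest_plan()
print(plan, final_state)
```

Breadth-first search is used here because every operator has the same cost, so the first goal state dequeued is reached by a shortest operator sequence.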