Neuro-Symbolic Procedural Planning with Commonsense Prompting

Yujie Lu, Weixi Feng, Wanrong Zhu, Wenda Xu, Xin Eric Wang, Miguel Eckstein, William Yang Wang

TL;DR

The paper tackles the difficulty of procedural planning in LLMs by treating it as a causal reasoning problem. It introduces PLAN, a neuro-symbolic planner that uses commonsense-infused prompts to implement a front-door causal intervention via a mediator $P_i$ constructed from external knowledge, enabling zero-shot planning without exemplars. Using a three-stage prompt construction pipeline followed by a translation/generation/grounding process, PLAN outperforms the compared baselines on WikiHow and RobotHow in both automatic and human evaluations, and remains robust on counterfactual task variants. The work demonstrates the potential of combining structural causal models with neuro-symbolic grounding to enhance long-horizon procedural reasoning in language models, with implications for embodied agents and virtual assistants.
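
For orientation, the textbook front-door adjustment that this kind of intervention relies on can be written as below. This is a simplified single-timestep sketch, not the paper's own derivation: it assumes one task variable $T$, one mediator $P_i$, one step $S_i$, and an unobserved confounder $D$ affecting both $T$ and $S_i$, and it ignores the dependence on the previous step $S_{i-1}$. The summation variables $p$ and $t'$ are introduced here for illustration.

$$
P\big(S_i \mid do(T = t)\big) \;=\; \sum_{p} P\big(P_i = p \mid T = t\big) \sum_{t'} P\big(S_i \mid P_i = p,\, T = t'\big)\, P\big(T = t'\big)
$$

The outer sum captures the effect of the task on the mediator, and the inner sum averages over task values $t'$ to block the backdoor path from the mediator to the step through $T$ and $D$.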

Abstract

Procedural planning aims to implement complex high-level goals by decomposition into sequential simpler low-level steps. Although procedural planning is a basic skill set for humans in daily life, it remains a challenge for large language models (LLMs) that lack a deep understanding of the cause-effect relations in procedures. Previous methods require manual exemplars to acquire procedural planning knowledge from LLMs in the zero-shot setting. However, such elicited pre-trained knowledge in LLMs induces spurious correlations between goals and steps, which impair the model generalization to unseen tasks. In contrast, this paper proposes a neuro-symbolic procedural PLANner (PLAN) that elicits procedural planning knowledge from the LLMs with commonsense-infused prompting. To mitigate spurious goal-step correlations, we use symbolic program executors on the latent procedural representations to formalize prompts from commonsense knowledge bases as a causal intervention toward the Structural Causal Model. Both automatic and human evaluations on WikiHow and RobotHow show the superiority of PLAN on procedural planning without further training or manual exemplars.

Paper Structure

This paper contains 49 sections, 9 equations, 8 figures, 16 tables, 1 algorithm.

Figures (8)

  • Figure 1: Two independent procedural planning task examples from RobotHow and WikiHow. PLAN constructs a commonsense-infused prompt from external knowledge (e.g., ConceptNet) to elicit the procedural planning ability of Large Language Models (LLMs) without training or exemplars.
  • Figure 2: Structural Causal Model (SCM) for Procedural Planning. (a) The full temporal causal graph. $T$ denotes the task query, and $S_i$ is the sub-goal step at timestep $i$. $D$ is the unobservable confounding variable introduced by the LLMs. $P_i$ denotes the mediating variable we construct to mitigate the spurious correlation. (b) The SCM at timestep $i$. Without causal intervention, the model produces the sub-goal step "find television" due to the spurious correlation between "television" and "living room" caused by the confounding variable $D$. With our causal intervention, the constructed mediating variable $P_i$ (Section \ref{sec:prompt_construction}) can block the backdoor paths for $T\rightarrow S_i$ and $S_{i-1}\rightarrow S_{i}$ (opened by $D$) and generate the causal sub-goal "find book" precisely (Section \ref{sec:method_languageplanning}).
  • Figure 3: Overview of Procedural Planning. Our five-stage pipeline includes: 1) semantically parsing the task $T$ into the entity set $T_{E}$ to retrieve the subgraph $G_s$ from the external knowledge base $G$; 2) formalizing the procedural prompt $P_G$ and then translating it into the admissible one $\hat{P_G}$; 3) aggregating the task, previous steps, and $P_G$ into the final commonsense-infused prompt $P$ (Section \ref{sec:prompt_construction}); 4) and 5) generating and translating the time-extended procedural plan until the termination condition is triggered (Section \ref{sec:method_languageplanning}).
  • Figure 4: The Front-Door Adjustment for the Causal Procedural Planner. (a) The structural causal model at timestep $i=1$. $T$ denotes the task name and $S_1$ denotes the step at timestep $1$. $D$ is the unobservable confounding variable introduced by the pre-training data. $P_1$ denotes the mediating variable we construct to mitigate the spurious correlation at timestep $1$. (b) $D$ opens up backdoor paths for $T\rightarrow S_i$, $P_{i-1}\rightarrow S_i$ and $S_{i-1}\rightarrow S_{i}$, which can be blocked by introducing $P_i$. Path $1$ and path $2$ share the same path $D\rightarrow T$. Intervention on $T$ blocks $D\rightarrow T$ and the backdoor path $2$. Intervention on $S_{i-1}$ blocks $D\rightarrow S_{i-1}$ and the backdoor path $3$. (c) The structural causal model at timestep $i>1$ after simplification based on Equations \ref{ft1}-\ref{ft5}.
  • Figure 5: The Causal Graph after the $do$-operation. (a) The causal graph transition of the Structural Causal Model at timestep $i=1$. (b) The causal graph transition of the Structural Causal Model at timestep $i>1$.
  • ...and 3 more figures
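
The prompt-construction stages named in the Figure 3 caption (parse the task into entities, retrieve a subgraph, formalize it as a prompt, aggregate with the task and previous steps) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy triples in `TOY_KB`, the substring-based entity matching, the prompt template, and all function names are hypothetical and are not the authors' implementation. The translation of $P_G$ into the admissible $\hat{P_G}$ and the LLM generation stages are omitted.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

# Hypothetical toy knowledge base standing in for an external graph G.
TOY_KB: List[Triple] = [
    ("book", "AtLocation", "living room"),
    ("book", "UsedFor", "reading"),
    ("television", "AtLocation", "living room"),
    ("toast", "MadeOf", "bread"),
]


def parse_entities(task: str, kb: List[Triple]) -> List[str]:
    """Stage 1a (assumed): keep KB head entities that occur in the task text."""
    text = task.lower()
    heads = sorted({head for head, _, _ in kb})
    return [head for head in heads if head in text]


def retrieve_subgraph(entities: List[str], kb: List[Triple]) -> List[Triple]:
    """Stage 1b (assumed): keep triples whose head is a task entity."""
    return [triple for triple in kb if triple[0] in entities]


def formalize_prompt(subgraph: List[Triple]) -> str:
    """Stage 2 (assumed): verbalize each triple as a short sentence."""
    return " ".join(f"{h} {r} {t}." for h, r, t in subgraph)


def build_prompt(task: str, previous_steps: List[str], kb: List[Triple]) -> str:
    """Stage 3 (assumed): aggregate knowledge prompt, task, and previous steps."""
    entities = parse_entities(task, kb)
    knowledge = formalize_prompt(retrieve_subgraph(entities, kb))
    steps = " ".join(
        f"Step {i}: {step}." for i, step in enumerate(previous_steps, start=1)
    )
    return f"{knowledge} Task: {task}. {steps}".strip()


if __name__ == "__main__":
    print(build_prompt("Read a book in the living room",
                       ["walk to living room"], TOY_KB))
```

In this toy setup only triples headed by "book" enter the prompt, so "television" is left out even though it shares the "living room" location. That mirrors the intent of the example in Figure 2, though the paper's actual retrieval and filtering are more involved.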