Graph Neural Network Based Action Ranking for Planning

Rajesh Mangannavar, Stefan Lee, Alan Fern, Prasad Tadepalli

TL;DR

Graph Neural Network Based Action Ranking for Planning (GABAR) addresses scalable planning by ranking actions directly in each state instead of learning a global value function. It introduces an action-centric graph representation and pairs a GNN encoder with a GRU-based autoregressive decoder that constructs grounded actions, with beam search used to explore multiple candidates. Trained on small, solvable instances, GABAR generalizes to significantly larger problems and outperforms baselines such as GPL, ASNets, and GRAPL in both coverage and plan quality; it also surpasses prompted large language models on planning tasks. The work shows that local action ranking on relational graphs generalizes robustly and performs well in practice, which suggests a promising direction for scalable relational policy learning in classical planning.

Abstract

We propose a novel approach to learn relational policies for classical planning based on learning to rank actions. We introduce a new graph representation that explicitly captures action information and propose a Graph Neural Network (GNN) architecture augmented with Gated Recurrent Units (GRUs) to learn action rankings. Unlike value-function based approaches, which must learn a globally consistent function, our action ranking method only needs to learn a locally consistent ranking. Our model is trained on data generated from small problem instances that are easily solved by planners and is applied to significantly larger instances where planning is computationally prohibitive. Experimental results across standard planning benchmarks demonstrate that our action-ranking approach not only generalizes better to problems larger than those used in training but also outperforms multiple baseline methods (both value-function and action-ranking based) in terms of success rate and plan quality.
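
One way to read "locally consistent" is that a ranking objective only compares the scores of actions available in the same state, so the scale of the scores across different states does not matter. The sketch below illustrates this with a pairwise hinge loss. The loss form, the function name `pairwise_ranking_loss`, and the example scores are assumptions chosen for illustration and are not taken from the paper.

```python
def pairwise_ranking_loss(scores, best_action, margin=1.0):
    """Hinge loss that pushes the best action's score above every other
    action available in the same state (illustrative, not the paper's loss)."""
    best = scores[best_action]
    return sum(
        max(0.0, margin - (best - score))
        for action, score in scores.items()
        if action != best_action
    )


# Hypothetical scores for two states. The score gaps are identical, but
# state_b's scores are shifted by a constant of 100.
state_a = {"unstack(O3,O2)": 1.0, "pick-up(O1)": 0.5}
state_b = {"stack(O3,O1)": 101.0, "put-down(O3)": 100.5}

loss_a = pairwise_ranking_loss(state_a, "unstack(O3,O2)")
loss_b = pairwise_ranking_loss(state_b, "stack(O3,O1)")
print(loss_a, loss_b)
```

Both losses are equal because only within-state score differences enter the objective. A value function, by contrast, would need the absolute numbers in the two states to be mutually consistent.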

Paper Structure

This paper contains 22 sections, 2 figures, 6 tables, 2 algorithms.

Figures (2)

  • Figure 1: GABAR's architecture for action extraction. (a) Graph representation: The input PDDL problem is converted into a graph with four types of nodes (predicate, object, action schema, and global) connected by predicate-object and action-object edges that encode state and grounded action information. (b) GNN encoder: Processes the graph through $L$ rounds of message passing in which edge, node, and global representations are updated in sequence. (c) Action decoder: Uses the final global embedding to construct a grounded action sequentially with a GRU-based decoder: it first selects an action schema, then iteratively chooses an object for each parameter position until a complete grounded action is formed.
  • Figure 2: Example graph construction for a simplified blocksworld problem with only $on$ and $clear$ predicates. The left side shows the starting and goal states. The $O$ nodes are the object nodes. In the start state, $O3$ is on $O2$ which is on the table, and $O1$ is on the table. In the goal state, $O3$ is on $O1$ which is on the table, and $O2$ is on the table. The right side shows the constructed graph with action nodes (blue), object nodes (yellow), and predicate nodes (red). The "Pick-up" action connects to object $O1$, while the "Unstack" action connects to objects $O2$ and $O3$. Predicate nodes show the current state ("Clear" for $O1$ and $O3$, "On (O3,O2)") and goal state ("G-ON(O3,O1)"). Blue edges represent action-object connections, while red edges represent predicate-object relationships.
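
The graph construction described in the Figure 2 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `PlanningGraph` class, the `build_graph` function, the node-id naming scheme, and the argument-position attribute stored on each edge are all hypothetical, and the list of applicable actions is supplied by hand instead of being computed from a PDDL domain.

```python
from dataclasses import dataclass, field


@dataclass
class PlanningGraph:
    nodes: dict = field(default_factory=dict)  # node id -> node type
    edges: list = field(default_factory=list)  # (source, target, edge type, argument position)

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type

    def add_edge(self, src, dst, edge_type, position):
        edge = (src, dst, edge_type, position)
        if edge not in self.edges:  # groundings of one schema may repeat an edge
            self.edges.append(edge)


def build_graph(objects, state, goal, applicable_actions):
    graph = PlanningGraph()
    # The caption does not say how the global node is wired, so it gets no edges here.
    graph.add_node("global", "global")
    for obj in objects:
        graph.add_node(obj, "object")

    # One predicate node per state atom and per goal atom (goal atoms get a "G-" prefix).
    for prefix, atoms in (("", state), ("G-", goal)):
        for name, args in atoms:
            node_id = f"{prefix}{name}({','.join(args)})"
            graph.add_node(node_id, "predicate")
            for position, obj in enumerate(args):
                graph.add_edge(node_id, obj, "predicate-object", position)

    # One node per action schema, linked to the objects of its applicable groundings.
    # Storing the argument position on the edge is an assumption of this sketch.
    for schema, args in applicable_actions:
        node_id = f"action:{schema}"
        graph.add_node(node_id, "action schema")
        for position, obj in enumerate(args):
            graph.add_edge(node_id, obj, "action-object", position)
    return graph


# The simplified blocksworld instance from the caption.
objects = ["O1", "O2", "O3"]
state = [("on", ("O3", "O2")), ("clear", ("O1",)), ("clear", ("O3",))]
goal = [("on", ("O3", "O1"))]
applicable_actions = [("pick-up", ("O1",)), ("unstack", ("O3", "O2"))]

graph = build_graph(objects, state, goal, applicable_actions)
for edge in graph.edges:
    print(edge)
```

As in the caption, the "pick-up" node connects only to `O1`, the "unstack" node connects to `O3` and `O2`, and the goal atom appears as a separate predicate node linked to the objects it mentions.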