What Planning Problems Can A Relational Neural Network Solve?

Jiayuan Mao, Tomás Lozano-Pérez, Joshua B. Tenenbaum, Leslie Pack Kaelbling

TL;DR

This work addresses when relational neural networks can implement goal-conditioned planning policies with polynomial-size circuits. It builds a bridge between policy realization and classical planning width by introducing serialized goal regression search (S-GRS) and the notions of regression width and SOS width, then derives upper bounds on policy-circuit size and depth as functions of these widths and planning horizon. It presents two RelNN-based compilation schemes—direct backward search and regression-rule-selector-guided compilation—that can yield finite-breadth, finite-depth circuits in domains with low width, and demonstrates depth-enabled generalization in several object-centric domains (e.g., Assembly3, Logistics, Blocks World). The results illuminate why RelNNs can generalize to larger instances in many planning tasks while clarifying the challenges in harder domains like Sokoban, guiding design choices for neural planners and suggesting extensions to hierarchical or continuous settings.

Abstract

Goal-conditioned policies are generally understood to be "feed-forward" circuits, in the form of neural networks that map from the current state and the goal specification to the next action to take. However, under what circumstances such a policy can be learned and how efficient the policy will be are not well understood. In this paper, we present a circuit complexity analysis for relational neural networks (such as graph neural networks and transformers) representing policies for planning problems, by drawing connections with serialized goal regression search (S-GRS). We show that there are three general classes of planning problems, in terms of the growth of circuit width and depth as a function of the number of objects and planning horizon, providing constructive proofs. We also illustrate the utility of this analysis for designing neural networks for policy learning.

Paper Structure

This paper contains 38 sections, 8 theorems, 4 equations, 6 figures, 3 tables, and 3 algorithms.

Key Result

Theorem 3.1

For any goal $g \in \textit{OSG}(s_0)$, S-GRS is optimal and complete. The proof is given in the paper's appendix on S-GRS completeness.
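
To make the idea of serialized goal regression concrete, here is a minimal Python sketch on a toy Blocks World instance. It is not the paper's S-GRS algorithm: it hard-codes a single regression rule for the goal $\textit{clear}(x)$ (if some block $y$ sits on $x$, first achieve $\textit{clear}(y)$, then move $y$ to the table) and ignores constraint sets and optimality checks. The state encoding and the names `block_on_top`, `apply_action`, and `achieve_clear` are hypothetical, introduced only for illustration.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, str]          # block -> what it rests on (a block or "table")
Action = Tuple[str, str, str]   # ("move", block, destination)


def block_on_top(state: State, x: str) -> Optional[str]:
    """Return the block resting directly on x, or None if x is clear."""
    for block, support in state.items():
        if support == x:
            return block
    return None


def apply_action(state: State, action: Action) -> State:
    """Apply a move action; the moved block must itself be clear."""
    _, block, dest = action
    assert block_on_top(state, block) is None, "moved block must be clear"
    new_state = dict(state)
    new_state[block] = dest
    return new_state


def achieve_clear(state: State, x: str) -> Tuple[List[Action], State]:
    """Regress the goal clear(x) with one hard-coded rule (an assumption):
    if y is on x, achieve the subgoal clear(y) first, then move y to the table.
    The subgoal is solved from the current state and the resulting state is
    passed on, which is the 'serialized' part of the search."""
    y = block_on_top(state, x)
    if y is None:                             # goal already holds
        return [], state
    plan, state = achieve_clear(state, y)     # subgoal: clear(y)
    action = ("move", y, "table")
    return plan + [action], apply_action(state, action)


# Toy instance: C on A, A on B, B on the table. Goal: clear(B).
initial = {"A": "B", "B": "table", "C": "A"}
plan, final = achieve_clear(initial, "B")
print(plan)
```

The recursion depth equals the height of the stack above the target block, which gives a rough intuition for why the depth of a compiled policy circuit may need to grow with the planning horizon.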

Figures (6)

  • Figure 1: (a) Illustration of the Blocks World domain that we will be using as the main example. (b) The action schema definition in Blocks World. $\textit{clear}(x)$ means there is no object on $x$. (c) A backward search tree for solving the goal $\textit{clear}(B)$. (d) A serialized goal regression search tree for the same goal. (e) The form of a goal-conditioned policy for this problem.
  • Figure 2: The input and output of a relational neural network (RelNN) policy.
  • Figure 3: State-dependent regression rule selector in the Blocks World domain. For brevity, we have omitted atoms in the constraint set. All rules listed above are applicable under any constraints.
  • Figure 4: Illustration of the branching factor caused by tracking multiple resulting states after achieving a subgoal.
  • Figure 5: Illustration of the compilation of backward search into a RelNN policy.
  • ...and 1 more figure

Theorems & Definitions (19)

  • Definition 3.1: Optimal serializability
  • Theorem 3.1
  • Definition 3.2: Generalized regression rules
  • Definition 3.3: Strong optimally-serializable (SOS) width of regression rules
  • Definition 3.4: SOS width of problems
  • Theorem 3.2
  • Theorem 3.3
  • Lemma 4.1: Logical expressiveness of relational neural networks (Luo et al., 2022; Cai et al., 1992)
  • Theorem 4.1: Compilation of BWD
  • Theorem 4.2: Compilation of S-GRS
  • ...and 9 more