Symmetries and Expressive Requirements for Learning General Policies

Dominik Drexler; Simon Ståhlberg; Blai Bonet; Hector Geffner

TL;DR

This work addresses the problem of detecting symmetries in planning and generalized planning and uses the results to assess the expressive requirements for learning general policies over various planning domains.

Abstract

State symmetries play an important role in planning and generalized planning. In the first case, state symmetries can be used to reduce the size of the search; in the second, to reduce the size of the training set. In generalized planning, however, it is also critical to distinguish non-symmetric states, i.e., states that represent non-isomorphic relational structures. While the language of first-order logic distinguishes non-symmetric states, the languages and architectures used to represent and learn general policies do not. In particular, recent approaches for learning general policies use state features derived from description logics or learned via graph neural networks (GNNs) that are known to be limited by the expressive power of $C_2$, first-order logic with two variables and counting. In this work, we address the problem of detecting symmetries in planning and generalized planning and use the results to assess the expressive requirements for learning general policies over various planning domains. For this, we map planning states to plain graphs, run off-the-shelf algorithms to determine whether two states are isomorphic with respect to the goal, and run coloring algorithms to determine if $C_2$ features computed logically or via GNNs distinguish non-isomorphic states. Symmetry detection results in more effective learning, while the failure to detect non-symmetries prevents general policies from being learned at all in certain domains.
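The gap between exact isomorphism and coloring-based tests mentioned in the abstract can be illustrated on plain graphs. The following is a minimal sketch of 1-WL color refinement, assuming unlabeled undirected graphs given as adjacency dictionaries; the function names and example graphs are illustrative and are not taken from the paper. Two disjoint triangles and a 6-cycle are not isomorphic (one is disconnected, the other connected), yet every vertex has degree 2, so the refinement never separates them.

```python
from collections import Counter

def wl_histograms(adj1, adj2):
    """Run 1-WL color refinement on the disjoint union of two graphs.

    Each graph is a dict: vertex -> list of neighbours (undirected).
    Returns the color histogram of each graph under a shared palette.
    """
    # Tag vertices with 1 or 2 so both graphs share one refinement run.
    adj = {(1, v): [(1, u) for u in ns] for v, ns in adj1.items()}
    adj.update({(2, v): [(2, u) for u in ns] for v, ns in adj2.items()})
    colors = {v: 0 for v in adj}  # assumption: no initial vertex labels
    for _ in range(len(adj)):
        # New color = old color plus the sorted multiset of neighbour colors.
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                for v in adj}
        palette = {sig: i for i, sig in enumerate(sorted(set(sigs.values())))}
        refined = {v: palette[sigs[v]] for v in adj}
        # Refinement only splits classes, so an equal class count means stable.
        stable = len(set(refined.values())) == len(set(colors.values()))
        colors = refined
        if stable:
            break
    h1 = Counter(c for (g, _), c in colors.items() if g == 1)
    h2 = Counter(c for (g, _), c in colors.items() if g == 2)
    return h1, h2

def wl_indistinguishable(adj1, adj2):
    """True if 1-WL assigns both graphs the same color histogram."""
    h1, h2 = wl_histograms(adj1, adj2)
    return h1 == h2

# Non-isomorphic pair that 1-WL cannot tell apart.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

# Non-isomorphic pair that 1-WL does separate.
path3 = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
```

Here `wl_indistinguishable(two_triangles, hexagon)` holds even though the graphs differ, while `wl_indistinguishable(path3, triangle)` does not. Equal histograms are therefore only a necessary condition for isomorphism, which is the kind of failure the figure captions below report for Barman, Blocks, and Grid states.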

Paper Structure (14 sections, 8 theorems, 2 equations, 5 figures, 2 tables)

Introduction
Related Work
Background
Classical Planning
Generalized Planning
States, Relational Structures, and Graphs
Abstractions
Isomorphic Relational Structures (States)
Computing The Abstraction
Abstractions and Domain Expressivity
Experiments: Domain Expressivity
Experiments: Learning on Abstractions
Discussion
Conclusions

Key Result

Theorem 3

Let $\mathcal{Q}/{\sim}$ be a faithful abstraction, and let $P$ be a problem in $\mathcal{Q}$. Then, 1) if $s_0,s_1,\ldots,s_n$ is a trajectory in $S_P$, then $[s_0],[s_1],\ldots,[s_n]$ is a trajectory in $\tilde{S}_P$, and 2) if $[s_0],[s_1],\ldots,[s_n]$ is a trajectory in $\tilde{S}_P$,

Figures (5)

Figure 1: Fragment of the state model $\tilde{S}_P$ for a Gripper instance with $n$ balls. Each equivalence class is identified by the number of balls in room A ($\#A$), the number of balls being held ($\#G$), and the position of the robot ($L$). For readability, transitions are labeled with the action schemas that induce them. The abstraction contains $6n$ abstract states (see text).
Figure 2: Object graph $G(s)$ for a state $s$ in a Gripper instance with grippers L and R, one ball b, and two rooms A and B. In the state $s$, the robot is at B, the ball is at gripper R, and the goal is for the ball to be in room B. The state specifies the goal using the goal predicate $\mathit{at}_g$. This graph is isomorphic to the graph $G(t)$ for a state $t$ that is like $s$ except that the ball is at gripper L.
Figure 3: Example of two Barman states from the same instance that have different $V^*$ values but are considered isomorphic by $1$-WL with respect to the goal. The left (resp. right) one is being held in the left (resp. right) hand, and the shaker (omitted) is on the table. The goal is to have cocktail $c_1$ in shot glass $s_1$ and $c_2$ in $s_2$. The only difference between the two states is that the ingredients in the two shots are swapped. However, in the state on the right, the ingredient $i_2$ in $s_1$ is wrong and must be removed, resulting in different $V^*$ values.
Figure 4: Example of two Blocks states that are considered isomorphic by $1$-WL with respect to the goal. In the object graphs, $1$-WL cannot determine whether the goal holds.
Figure 5: An example of two Grid states that are considered isomorphic by the $1$-WL algorithm with respect to the goal. The goal is to move the keys $k_1$ and $k_2$ to specific cells, as the arrows indicate. All keys and locks have the same shape. The agent $a$ is in the center of the grid. In the left state, $10$ actions are needed to solve the instance, while $12$ actions are needed in the right state.
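The two Gripper captions (Figures 1 and 2) can be checked by brute force on small instances. The sketch below makes several assumptions that are not in the source: it tests isomorphism directly on sets of ground atoms rather than on the object graph $G(s)$; the predicate names `at-robby`, `carry`, `free`, `room`, `ball`, and `gripper` follow the usual Gripper encoding, and only `at_g` appears in the caption; and the count ranges over all syntactically valid states rather than only reachable ones.

```python
from itertools import permutations, product

def isomorphic(atoms_s, atoms_t, objects):
    """Brute force: does some bijection on objects map atoms_s onto atoms_t?"""
    target = frozenset(atoms_t)
    for perm in permutations(objects):
        m = dict(zip(objects, perm))
        # Rename the arguments of every atom; predicate names stay fixed.
        image = frozenset((a[0],) + tuple(m[x] for x in a[1:]) for a in atoms_s)
        if image == target:
            return True
    return False

# Figure 2 instance: grippers L and R, one ball b, rooms A and B.
OBJECTS = ["L", "R", "b", "A", "B"]
STATIC = [("gripper", "L"), ("gripper", "R"), ("ball", "b"),
          ("room", "A"), ("room", "B")]
GOAL = [("at_g", "b", "B")]  # goal atom kept in the state, as in the caption

s = STATIC + GOAL + [("at-robby", "B"), ("carry", "b", "R"), ("free", "L")]
t = STATIC + GOAL + [("at-robby", "B"), ("carry", "b", "L"), ("free", "R")]
# Hypothetical third state: like s, but the robot is in room A.
u = STATIC + GOAL + [("at-robby", "A"), ("carry", "b", "R"), ("free", "L")]

def gripper_states(n):
    """Yield (ball_locations, robot_room) for n balls; each gripper holds <= 1 ball."""
    for locs in product("ABLR", repeat=n):
        if locs.count("L") > 1 or locs.count("R") > 1:
            continue
        for robot in "AB":
            yield locs, robot

def signature(state):
    """Abstract a state to (#A, #G, L) as described in the Figure 1 caption."""
    locs, robot = state
    return (locs.count("A"), locs.count("L") + locs.count("R"), robot)

def count_abstract_states(n):
    """Number of distinct (#A, #G, L) signatures over all n-ball states."""
    return len({signature(state) for state in gripper_states(n)})
```

Under these assumptions, `isomorphic(s, t, OBJECTS)` holds via the renaming that swaps L and R, whereas `isomorphic(s, u, OBJECTS)` fails because moving the robot would require swapping the rooms, which breaks the goal atom. The count matches the caption as well: with $\#G \in \{0,1,2\}$ there are $(n+1) + n + (n-1) = 3n$ choices of $(\#A, \#G)$, and two robot positions give $6n$.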

Theorems & Definitions (26)

Definition 1: Abstraction
Definition 2: Faithful Abstractions
Theorem 3: Bisimulation
proof
proof
Corollary 4
Definition 5: Uniform Policies
Theorem 6: Solvability
proof
proof
...and 16 more
