Deep Learning for Generalised Planning with Background Knowledge

Dillon Z. Chen, Rostislav Horčík, Gustav Šír

TL;DR

This paper proposes a new ML approach that lets users specify background knowledge (BK) as Datalog rules, which guide both the learning and planning processes in an integrated fashion. With BK supplied, the approach no longer has to relearn how to solve problems from scratch, so learning can focus on optimising plan quality.

Abstract

Automated planning is a form of declarative problem solving which has recently drawn attention from the machine learning (ML) community. ML has been applied to planning either as a way to test 'reasoning capabilities' of architectures, or more pragmatically in an attempt to scale up solvers with learned domain knowledge. In practice, planning problems are easy to solve but hard to optimise. However, ML approaches still struggle to solve many problems that are often easy for both humans and classical planners. In this paper, we thus propose a new ML approach that allows users to specify background knowledge (BK) through Datalog rules to guide both the learning and planning processes in an integrated fashion. By incorporating BK, our approach bypasses the need to relearn how to solve problems from scratch and instead focuses the learning on plan quality optimisation. Experiments with BK demonstrate that our method successfully scales and learns to plan efficiently with high-quality solutions from small training data generated in under 5 seconds.
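
To make the idea of BK written as Datalog rules more concrete, the following is a minimal sketch of how a handful of rules could induce a non-deterministic policy over a Blocksworld-style state. It is not the authors' implementation and does not involve the LRNN: the rule set, the predicate names (`on`, `clear`, `handempty`, `holding`), the uppercase-variable convention, and the helper functions `match`, `derive`, and `policy` are all assumptions made for illustration.

```python
# Hypothetical sketch: rule-defined background knowledge suggesting actions.
# A state is a set of ground atoms; a rule is (head, body) with uppercase variables.

def is_var(term):
    # Convention assumed for this sketch: variables start with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()

def match(atom, fact, binding):
    """Unify a rule atom with a ground fact; return the extended binding or None."""
    if len(atom) != len(fact) or atom[0] != fact[0]:
        return None
    extended = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if extended.setdefault(term, value) != value:
                return None  # variable already bound to a different object
        elif term != value:
            return None  # constant mismatch
    return extended

def derive(rule, facts):
    """Return all ground heads derivable from the facts by one rule (a naive join)."""
    head, body = rule
    bindings = [{}]
    for atom in body:
        next_bindings = []
        for binding in bindings:
            for fact in facts:
                extended = match(atom, fact, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return {tuple(b[t] if is_var(t) else t for t in head) for b in bindings}

# Illustrative rules only (not taken from the paper):
#   unstack(X, Y) :- on(X, Y), clear(X), handempty.
#   putdown(X)    :- holding(X).
RULES = [
    (("unstack", "X", "Y"), [("on", "X", "Y"), ("clear", "X"), ("handempty",)]),
    (("putdown", "X"), [("holding", "X")]),
]

def policy(state):
    """Non-deterministic policy: every action derived by some rule is allowed."""
    actions = set()
    for rule in RULES:
        actions |= derive(rule, state)
    return actions

state = {("on", "a", "b"), ("ontable", "b"), ("clear", "a"), ("handempty",)}
suggested = policy(state)
```

In this sketch the rules only restrict which actions are considered. The paper's contribution, per the abstract, is to learn on top of such BK so that the choice among the allowed actions improves plan quality.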

Paper Structure

This paper contains 28 sections, 1 equation, 4 figures, 1 table.

Figures (4)

  • Figure 1: Outline of the proposed approach. A domain and background knowledge are used to construct a Datalog program representing a generalised policy. The program can be extended with message passing rules into a parameterised LRNN trained to optimise plan quality.
  • Figure 2: Visualisation of the LRNN architecture. Logical representations of two states (left) form inputs into the Datalog program (middle) that induces differentiable computation graphs (right, partially displayed) for predicting action scores.
  • Figure 3: Range of task sizes in terms of the number of objects in the training and testing tasks.
  • Figure 4: Average plan length improvement (PLI), computed by Eqn. 1, over the baseline BK policies ($y$-axis) across tasks of increasing difficulty ($x$-axis). Baselines and planners are denoted with solid lines, and LRNN models with dotted lines.

Theorems & Definitions (5)

  • Definition 3.1: Non-deterministic Policy
  • Definition 3.2: Datalog Program Induced Policy
  • Example 3.3: Applicable Actions
  • Example 3.4: Blocksworld
  • Example 3.5: Satellite