Action Model Learning with Guarantees

Diego Aineto, Enrico Scala

TL;DR

This work develops a version-space-based framework for action model learning under full observability, introducing the VSLAM online algorithm to maintain all consistent hypotheses about action preconditions and effects. By manipulating the resulting lower and upper boundaries, it derives sound models (guaranteed safe transitions) and complete models (capable of non-deterministic planning), and proves that with sufficient demonstrations both formulations converge to the true dynamics. The theoretical contribution connects version-space theory with action-model learning, while the empirical evaluation across multiple IPC domains demonstrates complementary strengths: sound models enable early safe reasoning, and complete models enable broader exploratory planning. The approach offers a principled, adaptable pathway to learn reliable planning dynamics from demonstrations, including failure data, and points to future work on richer action representations, noise tolerance, and active data selection.

Abstract

This paper studies the problem of action model learning with full observability. Following the learning-by-search paradigm of Mitchell, we develop a theory for action model learning based on version spaces that interprets the task as a search for hypotheses that are consistent with the learning examples. Our theoretical findings are instantiated in an online algorithm that maintains a compact representation of all solutions of the problem. Among this range of solutions, we bring attention to action models approximating the actual transition system from below (sound models) and from above (complete models). We show how to manipulate the output of our learning algorithm to build deterministic and non-deterministic formulations of the sound and complete models and prove that, given enough examples, both formulations converge to the very same true model. Our experiments reveal their usefulness over a range of planning domains.
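
To make the version-space view concrete, the following is a minimal sketch for the preconditions of a single action. It is not the paper's VSLAM algorithm: it assumes preconditions are conjunctions of positive fluents, enumerates the hypothesis space explicitly instead of keeping a compact boundary representation, and uses hypothetical fluent names (`holding`, `clear`, `ontable`). A state is judged applicable by the sound reading only if every consistent hypothesis accepts it, and by the complete reading if at least one does.

```python
from itertools import combinations

# Hypothetical fluents relevant to one action (an assumption for illustration).
FLUENTS = {"holding", "clear", "ontable"}

def powerset(items):
    """All candidate preconditions: every subset of the fluents."""
    items = sorted(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def consistent(pre, examples):
    """A precondition is consistent if it accepts exactly the states
    in which the action was observed to be applicable."""
    return all((pre <= state) == applicable for state, applicable in examples)

def version_space(fluents, examples):
    """Explicit (non-compact) version space: all consistent hypotheses."""
    return [h for h in powerset(fluents) if consistent(h, examples)]

def sound_applicable(vs, state):
    """Approximation from below: every consistent hypothesis accepts the state."""
    return bool(vs) and all(h <= state for h in vs)

def complete_applicable(vs, state):
    """Approximation from above: some consistent hypothesis accepts the state."""
    return any(h <= state for h in vs)

# One positive and one negative (failed) demonstration.
examples = [
    (frozenset({"holding", "clear", "ontable"}), True),
    (frozenset({"ontable"}), False),
]
vs = version_space(FLUENTS, examples)

query = frozenset({"holding", "clear"})
sound_ok = sound_applicable(vs, query)        # False: not yet guaranteed safe
complete_ok = complete_applicable(vs, query)  # True: still possibly applicable

# With more demonstrations the version space shrinks to a single hypothesis.
more = examples + [
    (frozenset({"holding", "clear"}), True),
    (frozenset({"holding"}), False),
    (frozenset({"clear"}), False),
]
converged = version_space(FLUENTS, more)
```

In this toy run the sound and complete readings disagree on the query state after two demonstrations and coincide once the version space contains a single hypothesis, which mirrors the convergence claim in the abstract on a very small scale.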

Paper Structure (18 sections, 12 theorems, 3 figures, 1 table, 1 algorithm)

This paper contains 18 sections, 12 theorems, 3 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Let $\mathcal{V}_{\mathcal{H}^a_p,D_p}$ and $\mathcal{V}_{\mathcal{H}^a_e,D_e}$ be the version spaces of preconditions and effects of $a \in A$. The action model $M = \langle F,A,\mathsf{pre},\mathsf{eff} \rangle$ belongs to $\mathcal{M}_D$ if and only if $\forall a \in A : \mathsf{pre}(a) \in \ldots$

Figures (3)

  • Figure 1: Version spaces and version space learning.
  • Figure 2: Distinct positive demonstrations collected with the original (top) and modified domains (bottom).
  • Figure 3: F1-score (y-axis) of the sound and complete action models as the number of training demonstrations increases (x-axis), with different positive vs. negative ratios.

Theorems & Definitions (31)

  • Definition 1: Action Model
  • Definition 2: Demonstration
  • Definition 3: Action Model Learning Problem
  • Definition 4: Hypothesis and Hypothesis Spaces
  • Definition 5: Learning Example
  • Definition 6: Version Space
  • Theorem 1
  • proof
  • Theorem 2: Update rules for $\mathcal{V}_{\mathcal{H}^a_p}$
  • proof
  • ...and 21 more