Action Model Learning with Guarantees
Diego Aineto, Enrico Scala
TL;DR
This work develops a version-space-based framework for action model learning under full observability, introducing the VSLAM online algorithm to maintain all consistent hypotheses about action preconditions and effects. By manipulating the resulting lower and upper boundaries, it derives sound models (guaranteed safe transitions) and complete models (capable of non-deterministic planning), and proves that with sufficient demonstrations both formulations converge to the true dynamics. The theoretical contribution connects version-space theory with action-model learning, while the empirical evaluation across multiple IPC domains demonstrates complementary strengths: sound models enable early safe reasoning, and complete models enable broader exploratory planning. The approach offers a principled, adaptable pathway to learn reliable planning dynamics from demonstrations, including failure data, and points to future work on richer action representations, noise tolerance, and active data selection.
Abstract
This paper studies the problem of action model learning with full observability. Following Mitchell's learning-by-search paradigm, we develop a theory for action model learning based on version spaces that interprets the task as a search for hypotheses that are consistent with the learning examples. Our theoretical findings are instantiated in an online algorithm that maintains a compact representation of all solutions of the problem. Among this range of solutions, we bring attention to action models approximating the actual transition system from below (sound models) and from above (complete models). We show how to manipulate the output of our learning algorithm to build deterministic and non-deterministic formulations of the sound and complete models and prove that, given enough examples, both formulations converge to the same true model. Our experiments reveal their usefulness over a range of planning domains.
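To make the idea of maintaining bounds over consistent hypotheses concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a deterministic STRIPS-like action without conditional effects, states given as sets of true fluents, and only successful demonstrations. The names `ActionBounds`, `pre_specific`, `eff_certain`, and `eff_possible` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Optional, Set, Tuple

Literal = Tuple[str, bool]  # (fluent name, truth value)


def literals(state: Set[str], fluents: FrozenSet[str]) -> Set[Literal]:
    """Return every literal that holds in a fully observed state."""
    return {(f, f in state) for f in fluents}


@dataclass
class ActionBounds:
    """Illustrative bounds for one action (assumed STRIPS-like, deterministic)."""
    fluents: FrozenSet[str]
    # Most specific precondition: literals true in every observed pre-state.
    pre_specific: Optional[Set[Literal]] = None
    # Literals observed to change: these must be effects under the assumptions.
    eff_certain: Set[Literal] = field(default_factory=set)
    # Literals true in every observed post-state: the only candidates for effects.
    eff_possible: Optional[Set[Literal]] = None

    def observe(self, pre_state: Set[str], post_state: Set[str]) -> None:
        """Update the bounds online with one successful transition."""
        before = literals(pre_state, self.fluents)
        after = literals(post_state, self.fluents)
        self.pre_specific = (
            before if self.pre_specific is None else self.pre_specific & before
        )
        self.eff_certain |= after - before
        self.eff_possible = (
            after if self.eff_possible is None else self.eff_possible & after
        )


# Hypothetical "pickup" action with an irrelevant fluent "raining".
bounds = ActionBounds(frozenset({"clear", "ontable", "holding", "raining"}))
bounds.observe({"clear", "ontable", "raining"}, {"holding", "raining"})
bounds.observe({"clear", "ontable"}, {"holding"})
print(sorted(bounds.pre_specific))
print(sorted(bounds.eff_certain))
```

In this sketch, pairing `pre_specific` with `eff_certain` gives a cautious model in the spirit of the sound models described above: it only applies the action in states that resemble every demonstration seen so far. Each new demonstration can only shrink `pre_specific` and `eff_possible` and grow `eff_certain`, which mirrors the narrowing of a version space as examples accumulate.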
