Automatically Learning HTN Methods from Landmarks

Ruoxi Li, Dana Nau, Mark Roberts, Morgan Fine-Morris

TL;DR

This paper introduces CurricuLAMA, an automated framework for learning HTN methods by generating curricula from planning landmarks and applying curriculum learning. It eliminates the need for manually annotated tasks, proves soundness, and shows comparable convergence to HTN-Maker across multiple IPC domains. The approach combines CurricuGen, which builds landmark-based curricula, with CurricuLearn, which learns HTN methods from those curricula, and demonstrates efficiency with learning times well below planning times. The work suggests that landmarks can structure hierarchical knowledge learning and points to future enhancements in landmark ordering to reduce overgeneration and improve scalability.

Abstract

Hierarchical Task Network (HTN) planning usually requires a domain engineer to provide manual input about how to decompose a planning problem. Even HTN-MAKER, a well-known method-learning algorithm, requires a domain engineer to annotate the tasks with information about what to learn. We introduce CURRICULAMA, an HTN method learning algorithm that completely automates the learning process. It uses landmark analysis to compose annotated tasks and leverages curriculum learning to order the learning of methods from simpler to more complex. This eliminates the need for manual input, resolving a core issue with HTN-MAKER. We prove CURRICULAMA's soundness and show experimentally that its convergence rate in learning a complete set of methods is substantially similar to that of HTN-MAKER.

Paper Structure (10 sections, 5 figures, 2 algorithms)

Figures (5)

  • Figure 1: A Blocks World problem in which the initial state is a stack of 4 blocks. The goal is to make the bottom block A clear. The plan to achieve the goal is shown on the right.
  • Figure 2: A landmark graph for clearing block A from blocks B, C and D above in the Blocks World domain. The circled nodes are landmarks, where the dashed nodes are the landmarks that are satisfied in the initial state, and the filled node is the goal. The edges are orderings among the landmarks, where 'gn' stands for greedy necessary ordering, and 'n' stands for natural ordering.
  • Figure 3: The subplans generated from the landmarks.
  • Figure 4: Experimental results in (1) the Blocks World domain and (2) the Logistics domain. From left to right, the y-axes of the subfigures show (a) the fraction of problems that the planner could successfully solve using the methods that each learning algorithm learned; (b) the average length of the plans that the planner produced using the learned methods; (c) the average planning time over the 50 test problems; and (d) the total number of methods learned. In each subfigure, the x-axis shows the number of training problems (0 to 150) from which the methods were learned. The blue line displays the results for CurricuLAMA and the orange dashed line displays the results for HTN-Maker. The shaded areas indicate the variance in the number of methods learned across five trials.
  • Figure 5: Running time needed to learn methods. The bars represent the average time that each learning algorithm spent on different parts of the learning process. Green represents the time to obtain landmarks (CurricuLAMA algorithm, Lines 2 and 3), blue indicates the time to obtain the plan (CurricuLAMA algorithm, Lines 8 to 16), and orange shows the time to learn methods.