
A Roadmap to Guide the Integration of LLMs in Hierarchical Planning

Israel Puerta-Merino, Carlos Núñez-Molina, Pablo Mesejo, Juan Fernández-Olivares

TL;DR

This work investigates integrating Large Language Models (LLMs) into Hierarchical Planning (HP), a subfield of Automated Planning (AP) that leverages hierarchical knowledge to enhance planning performance. It introduces a two-dimensional taxonomy (Planning Process Role and LLM Improvement Strategy) and a standardized benchmark, built on IPC-2023 HTN problems, for evaluating HP-LLM methods. The benchmark reports results for a baseline LLM Planner and for the IPC-2023 winner PandaDealer, whose score is based on the ratio $C^*/C$. Initial experiments show that the baseline LLM Planner performs poorly: it produces only a small fraction of feasible and correct plans (3% correct) and no correct hierarchical decompositions, which leaves substantial room for improvement. The paper proposes a roadmap for future research that advocates knowledge-enhancement and multi-call strategies, along with exploring further stages of the HP life cycle, to advance HP-LLM integration and its practical applicability.
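To make the $C^*/C$ ratio concrete, the following is a minimal sketch of how such a quality score could be computed. It assumes the usual IPC convention, which this summary does not spell out: $C^*$ is the cost of a reference plan, $C$ is the cost of the plan the planner returned, and an unsolved problem scores 0. The names `quality_score` and `total_score` are hypothetical and do not come from the paper or its benchmark code.

```python
from typing import Optional, Sequence, Tuple


def quality_score(reference_cost: float, plan_cost: Optional[float]) -> float:
    """Return C*/C for one problem (assumed convention, not taken from the paper).

    reference_cost is C*, the cost of a reference plan.
    plan_cost is C, the cost of the plan found, or None if no plan was found.
    """
    if plan_cost is None:
        # Assumption: an unsolved problem contributes a score of 0.
        return 0.0
    if reference_cost <= 0 or plan_cost <= 0:
        raise ValueError("plan costs must be positive")
    return reference_cost / plan_cost


def total_score(results: Sequence[Tuple[float, Optional[float]]]) -> float:
    """Sum the per-problem scores over (C*, C) pairs."""
    return sum(quality_score(ref, cost) for ref, cost in results)


# Three hypothetical problems: one solved at twice the reference cost,
# one solved at the reference cost, and one left unsolved.
example = [(10.0, 20.0), (10.0, 10.0), (10.0, None)]
print(total_score(example))  # 1.5
```

Under this assumed convention, a plan that matches the reference cost scores 1, and longer or costlier plans score proportionally less.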

Abstract

Recent advances in Large Language Models (LLMs) are fostering their integration into several reasoning-related fields, including Automated Planning (AP). However, their integration into Hierarchical Planning (HP), a subfield of AP that leverages hierarchical knowledge to enhance planning performance, remains largely unexplored. In this preliminary work, we propose a roadmap to address this gap and harness the potential of LLMs for HP. To this end, we present a taxonomy of integration methods, exploring how LLMs can be utilized within the HP life cycle. Additionally, we provide a benchmark with a standardized dataset for evaluating the performance of future LLM-based HP approaches, and present initial results for a state-of-the-art HP planner and an LLM planner. As expected, the latter exhibits limited performance (3% correct plans, and none with a correct hierarchical decomposition) but serves as a valuable baseline for future approaches.

Paper Structure (14 sections, 3 tables)