Graph-enhanced Large Language Models in Asynchronous Plan Reasoning
Fangru Lin, Emanuele La Malfa, Valentin Hofmann, Elle Michelle Yang, Anthony Cohn, Janet B. Pierrehumbert
TL;DR
The paper addresses asynchronous planning by formalizing it as a longest-path problem on a DAG and introducing a dedicated benchmark, AsyncHow. It proposes Plan Like a Graph (PLaG), a graph-enhanced prompting technique that improves LLM performance across models and task complexities, achieving state-of-the-art results but revealing persistent degradation as complexity grows. The study provides a formal complexity measure that predicts performance, conducts extensive cross-model experiments, and offers rich analyses (ablations, out-of-distribution probes, qualitative cases) to understand the limits of LLMs as autonomous planning devices. Overall, while PLaG boosts capabilities, the results highlight fundamental scalability limits and motivate further exploration of graph-informed representations and multimodal data for robust autonomous planning.
Abstract
Planning is a fundamental property of human intelligence. Reasoning about asynchronous plans is challenging since it requires sequential and parallel planning to optimize time costs. Can large language models (LLMs) succeed at this task? Here, we present the first large-scale study investigating this question. We find that a representative set of closed and open-source LLMs, including GPT-4 and LLaMA-2, perform poorly on our benchmark AsyncHow when not supplied with illustrations of the task-solving process. We propose a novel technique called Plan Like a Graph (PLaG) that combines graphs with natural language prompts and achieves state-of-the-art results. We show that although PLaG can boost model performance, LLMs still suffer from drastic degradation when task complexity increases, highlighting the limits of utilizing LLMs for simulating digital devices. We see our study as an exciting step towards using LLMs as efficient autonomous agents. Our code and data are available at https://github.com/fangru-lin/graph-llm-asynchow-plan.
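
To make the longest-path view of asynchronous planning concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a plan is given as step durations plus prerequisite constraints that form a DAG, and that any steps without a mutual dependency can run in parallel. The function name min_plan_time and the cooking example are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def min_plan_time(durations, prerequisites):
    """Return the shortest total time to finish every step of a plan.

    durations: dict mapping each step to the time it takes.
    prerequisites: dict mapping a step to the set of steps that must
        finish before it can start (steps with none may be omitted).
    Assumes unlimited parallelism, so the answer is the length of the
    longest (duration-weighted) path through the dependency DAG.
    """
    graph = {step: prerequisites.get(step, ()) for step in durations}
    # static_order() yields every step after all of its prerequisites
    # and raises graphlib.CycleError if the constraints are cyclic.
    finish = {}
    for step in TopologicalSorter(graph).static_order():
        earliest_start = max(
            (finish[p] for p in prerequisites.get(step, ())), default=0
        )
        finish[step] = earliest_start + durations[step]
    return max(finish.values(), default=0)


# Hypothetical plan: water can boil while the vegetables are chopped.
durations = {"boil water": 10, "chop vegetables": 5, "cook soup": 15}
prerequisites = {"cook soup": {"boil water", "chop vegetables"}}
print(min_plan_time(durations, prerequisites))  # 25
```

Done strictly one step at a time, the same plan would take 10 + 5 + 15 = 30 time units. Running the two independent steps in parallel brings it down to 25, which is the kind of time-cost optimization the task asks for.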
