The Case for Developing a Foundation Model for Planning-like Tasks from Scratch
Biplav Srivastava, Vishal Pallagani
TL;DR
The paper argues for developing a Planning Foundation Model (FM) from scratch to tackle planning-like (PL) tasks beyond traditional Automated Planning and Scheduling (APS). It proposes a Seq2Seq Planning FM with domain-specific tokenization and rotary position embeddings (RoPE), together with novel pre-training objectives for learning execution semantics. For example, a classical planning problem (CPP) is treated as $\mathcal{M}=\langle \mathcal{D}, \mathcal{I}, \mathcal{G} \rangle$ with domain $\mathcal{D}=\langle F,A\rangle$ (fluents $F$ and actions $A$), initial state $\mathcal{I}$, and goal $\mathcal{G}$, where the transition function $\delta_{\mathcal{M}}(s,a)$ guides plan execution. The paper envisions a diverse PL corpus and evaluation metrics such as Plan Validity and Plan Optimality to quantify planning capabilities, aiming for generalization across tasks such as plan generation, replanning, and plan summarization. The authors emphasize grounding, alignment, and instructability as essential properties, and argue that a Planning FM could generalize across domains (business processes, dialogs, CAD workflows) and advance reliable, executable PL solutions in a way analogous to LLM-driven progress in APS.
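To make the formalism above concrete, the following is a minimal sketch of how a transition function $\delta_{\mathcal{M}}(s,a)$ and a Plan Validity check could look. It assumes STRIPS-style actions with preconditions, add effects, and delete effects, which the summary does not specify. The names `Action`, `delta`, and `is_valid_plan`, and the toy key-fetching domain, are hypothetical illustrations and not the authors' implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence

State = FrozenSet[str]  # a state is the set of fluents that currently hold


@dataclass(frozen=True)
class Action:
    """Assumed STRIPS-style action: preconditions, add and delete effects."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def delta(state: State, action: Action) -> Optional[State]:
    """Transition function: successor state, or None if the action is inapplicable."""
    if not action.pre <= state:
        return None
    return (state - action.delete) | action.add


def is_valid_plan(init: State, goal: State, plan: Sequence[Action]) -> bool:
    """Plan Validity: every action is applicable in turn and the goal holds at the end."""
    state: Optional[State] = init
    for action in plan:
        state = delta(state, action)
        if state is None:
            return False
    return goal <= state


# Toy domain (hypothetical): a robot moves from room A to room B and picks up a key.
MOVE = Action("move_a_b", frozenset({"at_a"}), frozenset({"at_b"}), frozenset({"at_a"}))
PICK = Action("pick_key", frozenset({"at_b", "key_at_b"}),
              frozenset({"has_key"}), frozenset({"key_at_b"}))
INIT: State = frozenset({"at_a", "key_at_b"})
GOAL: State = frozenset({"has_key"})

print(is_valid_plan(INIT, GOAL, [MOVE, PICK]))  # plan that moves first, then picks
print(is_valid_plan(INIT, GOAL, [PICK]))        # plan that tries to pick from room A
```

Under these assumptions, Plan Optimality could be checked on top of the same function by comparing the length or cost of a valid plan against that of a known optimal plan.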
Abstract
Foundation Models (FMs) have revolutionized many areas of computing, including Automated Planning and Scheduling (APS). For example, a recent study found them useful for a range of planning problems: plan generation, language translation, model construction, multi-agent planning, interactive planning, heuristics optimization, tool integration, and brain-inspired planning. Beyond APS, there are many seemingly related tasks that involve generating a series of actions, with varying guarantees of executability, to achieve intended goals. We collectively call these planning-like (PL) tasks; examples include business processes, programs, workflows, and guidelines, and researchers have considered using FMs for them. However, previous work has primarily relied on pre-trained, off-the-shelf FMs, optionally fine-tuning them. This paper discusses the need for building a comprehensive FM for PL tasks from scratch and explores its design considerations. We argue that such an FM will open new and efficient avenues for PL problem-solving, just as LLMs are doing for APS.
