Conversational Planning for Personal Plans
Konstantina Christakopoulou, Iris Qu, John Canny, Andrew Goodridge, Cj Adams, Minmin Chen, Maja Matarić
TL;DR
Long-horizon real-life goals require conversational agents that can plan across multiple sessions. The paper introduces a language-based hierarchical framework in which an LLM acts as a meta-controller to select discrete macro-actions from the set $Z = \{\text{add-steps}, \text{alter-steps}, \text{ask-question}\}$, with execution performed by tool-enabled sub-policies and low-level policies guided by Chain-of-Thought prompting and user feedback. The authors provide a detailed architectural blueprint and qualitative demonstrations across learning and health domains, including tutoring and coaching tasks, illustrating adaptive planning and plan evolution through user interaction. This approach advances text-based agents toward sustained, personalized collaboration with users over extended timelines, enabling dynamic refinement of plans as goals evolve.
Abstract
The language generation and reasoning capabilities of large language models (LLMs) have enabled conversational systems with impressive performance in a variety of tasks, from code generation, to composing essays, to passing STEM and legal exams, to a new paradigm for knowledge search. Beyond these short-term applications, LLMs are increasingly used to help with real-life goals or tasks that take a long time to complete, involving multiple sessions across days, weeks, months, or even years. Thus, to enable conversational systems for long-term interactions and tasks, we need language-based agents that can plan for long horizons. Traditionally, such capabilities were addressed by reinforcement learning agents with hierarchical planning capabilities. In this work, we explore a novel architecture where the LLM acts as the meta-controller deciding the agent's next macro-action, and tool-use-augmented, LLM-based option policies execute the selected macro-action. We instantiate this framework for a specific set of macro-actions enabling adaptive planning for users' personal plans through conversation and follow-up questions that collect user feedback. We show how this paradigm can be applied in scenarios ranging from tutoring for academic and non-academic tasks to conversational coaching for personal health plans.
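The control flow described above (a meta-controller chooses a macro-action, then an option policy executes it) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the LLM meta-controller and the tool-augmented option policies are replaced by small deterministic stubs, and every name here (`PlanState`, `stub_meta_controller`, `agent_turn`, the selection rule, and the question wording) is hypothetical and chosen only for illustration. Only the three macro-action names come from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# The three macro-actions named in the paper.
MACRO_ACTIONS = ("add-steps", "alter-steps", "ask-question")


@dataclass
class PlanState:
    """Hypothetical container for the user's goal, plan, and feedback."""
    goal: str
    steps: List[str] = field(default_factory=list)
    feedback: List[str] = field(default_factory=list)


def stub_meta_controller(state: PlanState) -> str:
    """Stand-in for the LLM meta-controller (assumption: a fixed rule).

    A real system would prompt an LLM with the conversation so far and
    parse one of MACRO_ACTIONS from its output.
    """
    if not state.feedback:
        return "ask-question"   # nothing known about the user yet
    if not state.steps:
        return "add-steps"      # feedback exists but the plan is empty
    return "alter-steps"        # refine the existing plan


def add_steps(state: PlanState) -> str:
    """Stub option policy: append one placeholder step to the plan."""
    state.steps.append(f"Step {len(state.steps) + 1} toward: {state.goal}")
    return "Added a step to the plan."


def alter_steps(state: PlanState) -> str:
    """Stub option policy: annotate every step with the latest feedback."""
    latest = state.feedback[-1]
    state.steps = [f"{s} (adjusted for: {latest})" for s in state.steps]
    return "Adjusted the plan based on your feedback."


def ask_question(state: PlanState) -> str:
    """Stub option policy: return a follow-up question for the user."""
    return f"What constraints should I know about for '{state.goal}'?"


OPTION_POLICIES: Dict[str, Callable[[PlanState], str]] = {
    "add-steps": add_steps,
    "alter-steps": alter_steps,
    "ask-question": ask_question,
}


def agent_turn(
    state: PlanState,
    controller: Callable[[PlanState], str] = stub_meta_controller,
) -> Tuple[str, str]:
    """Run one turn: select a macro-action, then execute its option policy."""
    action = controller(state)
    if action not in OPTION_POLICIES:
        raise ValueError(f"Unknown macro-action: {action}")
    return action, OPTION_POLICIES[action](state)


if __name__ == "__main__":
    state = PlanState(goal="learn conversational Spanish")
    print(agent_turn(state))
    state.feedback.append("only 20 minutes a day")
    print(agent_turn(state))
    print(agent_turn(state))
    print(state.steps)
```

In this sketch the controller is passed in as a plain function, so swapping the rule-based stub for an actual LLM call would not change `agent_turn`; how the paper's system prompts the LLM and parses its choice is not specified here.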
