JumpStarter: Human-AI Planning with Task-Structured Context Curation

Xuanming Zhang, Sitong Wang, Jenny Ma, Alyssa Hwang, Zhou Yu, Lydia B. Chilton

TL;DR

JumpStarter tackles the challenge of planning with large language models by introducing task-structured context curation, which organizes a complex goal into a hierarchical subtask tree and localizes context management at decision points. The approach combines context elicitation, selection, and reuse to generate actionable drafts that are more personalized and coherent than those produced by flat long-context prompting. Empirical results show a 16% improvement in plan quality over ablations, and in a user study, plans of 79% higher quality than those produced with GPT-4o via ChatGPT, along with reduced task load. The work demonstrates that structured, task-centered context management enhances human-AI collaboration for goal-driven planning and points toward more transparent, modular, and context-aware LLM-based assistants in real-world workflows.

Abstract

Human-AI planning for complex goals remains challenging with current large language models (LLMs), which rely on linear chat histories and simplistic memory mechanisms. Despite advances in long-context prompting, users still manually manage information, leading to a high cognitive burden. Hence, we propose JumpStarter, a system that enables LLMs to collaborate with humans on complex goals by dynamically decomposing tasks to help users manage context. We specifically introduce task-structured context curation, a novel framework that breaks down a user's goal into a hierarchy of actionable subtasks, and scopes context to localized decision points, enabling finer-grained personalization and reuse. The framework is realized through three core mechanisms: context elicitation, selection, and reuse. We demonstrate that task-structured context curation significantly improves plan quality by 16% over ablations. Our user study shows that JumpStarter helped users generate plans with 79% higher quality compared to ChatGPT.
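
To make the idea of scoping context to localized decision points concrete, the following is a minimal sketch, not the authors' implementation. It assumes that context is stored on nodes of the subtask tree, that "reuse" means a subtask inherits context from its ancestors, and that "selection" can be stood in for by a simple keyword filter (the actual system presumably relies on an LLM and user choices). The names `Subtask`, `inherited_context`, and `select_context` are hypothetical, and the CV entry is made-up illustrative data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Subtask:
    """One node in the subtask tree; the root node is the user's goal."""

    title: str
    # Excluded from repr/compare to avoid parent <-> child recursion.
    parent: Subtask | None = field(default=None, repr=False, compare=False)
    children: list[Subtask] = field(default_factory=list)
    # Context elicited at this node (hypothetical storage format: plain strings).
    context: list[str] = field(default_factory=list)

    def add_child(self, title: str) -> Subtask:
        child = Subtask(title=title, parent=self)
        self.children.append(child)
        return child


def inherited_context(node: Subtask) -> list[str]:
    """Reuse (assumed): gather context from the root goal down to this node."""
    path = []
    current = node
    while current is not None:
        path.append(current)
        current = current.parent
    items = []
    for ancestor in reversed(path):  # root first, the node itself last
        items.extend(ancestor.context)
    return items


def select_context(node: Subtask, keywords: list[str]) -> list[str]:
    """Selection (keyword stand-in): keep inherited items mentioning any keyword."""
    lowered = [k.lower() for k in keywords]
    return [
        item
        for item in inherited_context(node)
        if any(k in item.lower() for k in lowered)
    ]


# Illustrative tree for the goal shown in Figure 4.
goal = Subtask("Apply for a PhD in NLP", context=["CV: two NLP research internships"])
shortlist = goal.add_child("Shortlist schools")
shortlist.context.append("Location: I want schools in midwest of US")
sop = goal.add_child("Draft statement of purpose")
```

In this sketch, the location preference is attached to the "Shortlist schools" node, so it is visible when drafting that subtask but does not leak into the sibling "Draft statement of purpose" subtask, while the goal-level CV context is available to both.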

Paper Structure (64 sections, 19 figures, 4 tables)

Figures (19)

  • Figure 1: JumpStarter helps users get started on their personal goals through task-structured context curation. It first takes the user's goal and elicits context for the goal. It then decomposes the goal into actionable subtasks. For each subtask, it helps users select relevant context and write answer drafts. It also aids users in refining these drafts by eliciting further context. Task-structured context curation improves plan quality over ablations. Our user study showed that JumpStarter helped users generate plans with 79% higher quality compared to using GPT-4o via the ChatGPT interface.
  • Figure 2: Expert evaluation of plan and answer draft quality across three conditions. Our task-structured context curation method (context elicitation, selection, and reuse) significantly outperforms both context reuse only and context selection and reuse only. Improvements are statistically significant (***p < 0.001, **p < 0.01).
  • Figure 3: Comparison of user study results for ChatGPT and JumpStarter. Statistical test results comparing JumpStarter with ChatGPT are reported with p-values ($*$: $p<.050$, $**$: $p<.010$, $***$: $p<.001$).
  • Figure 4: A screenshot of JumpStarter creating plans and answer drafts for the goal "Apply for a PhD in NLP". (A) The task breakdown is shown as a subtask tree, with the goal as the root node. Subtasks decomposed from the same parent node are shown on the same level. (B) Saving the answer draft. (C) Detailed descriptions of the selected subtask are shown. (D) The answer draft is generated, taking into account the user's specification: "I want schools in midwest of US". Users have three options to improve the draft: regenerate, add context and regenerate, and iterate based on their new specifications.
  • Figure 5: JumpStarter generates questions to elicit context from users to clarify the goal. The user uploads his CV.
  • ...and 14 more figures