PAACE: A Plan-Aware Automated Agent Context Engineering Framework

Kamer Ali Yuksel

TL;DR

The paper tackles the context-management bottleneck in long-horizon LLM agents by introducing PAACE, a plan-aware context engineering framework. PAACE jointly models plan structure and next-k task relevance to rewrite, prune, summarize, and compress agent context, guided by outcome-preserving supervision. It comprises PAACE-Syn, which generates large-scale synthetic, plan-annotated workflows for training, and PAACE-FT, a distilled compressor that reproduces the teacher's plan-aware compressions in a compact model. Across AppWorld, OfficeBench, and Multi-Objective QA, PAACE improves task correctness while substantially reducing context load and attention dependency. PAACE-FT retains 97–98% of the teacher's performance while cutting inference cost by over an order of magnitude, which makes plan-aware compression practical to deploy. These results position plan-aware context shaping as a core, transferable component for robust long-horizon agent reasoning.

Abstract

Large Language Model (LLM) agents are increasingly deployed in complex, multi-step workflows involving planning, tool use, reflection, and interaction with external knowledge systems. These workflows generate rapidly expanding contexts that must be curated, transformed, and compressed to maintain fidelity, avoid attention dilution, and reduce inference cost. Prior work on summarization and query-aware compression largely ignores the multi-step, plan-aware nature of agentic reasoning. In this work, we introduce PAACE (Plan-Aware Automated Context Engineering), a unified framework for optimizing the evolving state of LLM agents through next-k-task relevance modeling, plan-structure analysis, instruction co-refinement, and function-preserving compression. PAACE comprises (1) PAACE-Syn, a large-scale generator of synthetic agent workflows annotated with stepwise compression supervision, and (2) PAACE-FT, a family of distilled, plan-aware compressors trained from successful teacher demonstrations. Experiments on long-horizon benchmarks (AppWorld, OfficeBench, and 8-Objective QA) demonstrate that PAACE consistently improves agent correctness while substantially reducing context load. On AppWorld, PAACE achieves higher accuracy than all baselines while lowering peak context and cumulative dependency. On OfficeBench and multi-hop QA, PAACE improves both accuracy and F1, achieving fewer steps, lower peak tokens, and reduced attention dependency. Distilled PAACE-FT retains 97 percent of the teacher's performance while reducing inference cost by over an order of magnitude, enabling practical deployment of plan-aware compression with compact models.

Paper Structure

This paper contains 9 sections, 6 equations, and 4 tables.