PADME: Procedure Aware DynaMic Execution
Deepeka Garg, Sihan Zeng, Annapoorani L. Narayanan, Sumitra Ganesh, Leo Ardon
TL;DR
PADME addresses the challenge of executing long-horizon procedures from free-form text by introducing a graph-based procedure representation and a two-phase Teach/Execute framework. The Teach phase converts unstructured procedures into executable graphs with dependencies and decision points, while the Execute phase traverses these graphs in real time, adapting to context without losing structural coherence. This separation provides an inductive bias that reduces error accumulation and enables reuse across diverse domains, achieving state-of-the-art performance on four benchmarks, including ALFWorld and ScienceWorld. The approach yields interpretable, reusable procedural plans and is robust to ambiguity in the source text, which makes it practical for reliable, general-purpose procedural automation.
Abstract
Learning to autonomously execute long-horizon procedures from natural language remains a core challenge for intelligent agents. Free-form instructions such as recipes, scientific protocols, or business workflows encode rich procedural knowledge, but their variability and lack of structure cause agents driven by large language models (LLMs) to drift or fail during execution. We introduce Procedure Aware DynaMic Execution (PADME), an agent framework that produces and exploits a graph-based representation of procedures. Unlike prior work that relies on manual graph construction or unstructured reasoning, PADME autonomously transforms procedural text into executable graphs that capture task dependencies, decision points, and reusable subroutines. Central to PADME is a two-phase methodology: a Teach phase, which systematically structures procedures and enriches them with executable logic, followed by an Execute phase, which performs dynamic execution in response to real-time inputs and environment feedback. This separation supports quality assurance and scalability, allowing expert knowledge to be encoded once and reliably reused across varying contexts. The graph representation also provides an inductive bias that reduces error accumulation in long-horizon reasoning, underscoring the importance of structured procedure modeling for reliable agent-driven automation. Empirically, PADME achieves state-of-the-art performance on four diverse benchmarks, including ALFWorld and ScienceWorld. These results demonstrate that graph-based procedure representations offer agents a powerful intermediate abstraction for robust and generalizable execution.
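To make the idea of an executable procedure graph concrete, the following is a minimal sketch, not PADME's actual implementation. It assumes a hypothetical `Node` type with dependencies and an optional decision function, a hand-written graph standing in for the output of the Teach phase (which the abstract describes as automatic), and a simple `execute` loop standing in for the Execute phase. All names, fields, and the tea-making procedure are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Context = Dict[str, object]


@dataclass
class Node:
    """One step of a procedure. Field names are hypothetical, not from the paper."""
    name: str
    action: Callable[[Context], None]
    depends_on: List[str] = field(default_factory=list)
    # Decision point: picks the next node name from the runtime context.
    # None means the procedure ends after this node.
    choose_next: Optional[Callable[[Context], Optional[str]]] = None


def build_tea_graph() -> Dict[str, Node]:
    """Stand-in for a Teach-phase result; written by hand here for illustration."""
    return {
        "boil_water": Node(
            name="boil_water",
            action=lambda ctx: ctx.update(water_hot=True),
            choose_next=lambda ctx: "check_milk",
        ),
        "check_milk": Node(
            name="check_milk",
            action=lambda ctx: None,
            depends_on=["boil_water"],
            # Branch on a runtime input rather than on the procedure text.
            choose_next=lambda ctx: "add_milk" if ctx.get("wants_milk") else "serve",
        ),
        "add_milk": Node(
            name="add_milk",
            action=lambda ctx: ctx.update(milk_added=True),
            depends_on=["boil_water"],
            choose_next=lambda ctx: "serve",
        ),
        "serve": Node(
            name="serve",
            action=lambda ctx: ctx.update(served=True),
            depends_on=["boil_water"],
        ),
    }


def execute(graph: Dict[str, Node], start: str, context: Context,
            max_steps: int = 100) -> List[str]:
    """Stand-in for the Execute phase: walk the graph and return the visited nodes."""
    done: List[str] = []
    current: Optional[str] = start
    while current is not None:
        if len(done) >= max_steps:
            raise RuntimeError("step limit reached; the graph may contain a cycle")
        node = graph[current]
        missing = [d for d in node.depends_on if d not in done]
        if missing:
            raise RuntimeError(f"{node.name} is missing dependencies: {missing}")
        node.action(context)
        done.append(node.name)
        current = node.choose_next(context) if node.choose_next else None
    return done


if __name__ == "__main__":
    print(execute(build_tea_graph(), "boil_water", {"wants_milk": True}))
    print(execute(build_tea_graph(), "boil_water", {"wants_milk": False}))
```

The same graph is reused for both runs and only the context changes, which mirrors the abstract's point that a procedure is encoded once and then executed under varying conditions. The dependency check is one simple way a graph structure can stop an agent from drifting out of order on a long procedure.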
