DynaSaur: Large Language Agents Beyond Predefined Actions

Dang Nguyen; Viet Dac Lai; Seunghyun Yoon; Ryan A. Rossi; Handong Zhao; Ruiyi Zhang; Puneet Mathur; Nedim Lipka; Yu Wang; Trung Bui; Franck Dernoncourt; Tianyi Zhou


TL;DR

DynaSaur reframes LLM agents to dynamically generate and compose actions as Python functions, overcoming the rigidity of fixed action sets. Actions are accumulated over time, enabling reuse and complex behavior through composition, and an action retrieval mechanism selects relevant generated functions. Empirical results across GAIA, MATH, TabMWP, AIME, and GPQA show substantial performance gains and robustness, with ablations confirming the contributions of action implementation, accumulation, and initial tooling. The framework maintains compatibility with human-designed tools and can incorporate external tools, highlighting practical impact for open-ended, real-world tasks while acknowledging safety considerations for code execution.

Abstract

Existing LLM agent systems typically select actions from a fixed and predefined set at every step. While this approach is effective in closed, narrowly scoped environments, it presents two major challenges for real-world, open-ended scenarios: (1) it significantly restricts the planning and acting capabilities of LLM agents, and (2) it requires substantial human effort to enumerate and implement all possible actions, which is impractical in complex environments with a vast number of potential actions. To address these limitations, we propose an LLM agent framework that can dynamically create and compose actions as needed. In this framework, the agent interacts with its environment by generating and executing programs written in a general-purpose programming language. Moreover, generated actions are accumulated over time for future reuse. Our extensive experiments across multiple benchmarks show that this framework significantly improves flexibility and outperforms prior methods that rely on a fixed action set. Notably, it enables LLM agents to adapt and recover in scenarios where predefined actions are insufficient or fail due to unforeseen edge cases. Our code is available at https://github.com/adobe-research/dynasaur.
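
The idea of generating actions as Python functions, accumulating them, and retrieving them for reuse can be illustrated with a minimal sketch. This is not the authors' implementation: `ActionLibrary`, `add`, and `retrieve` are hypothetical names, the function sources are hard-coded stand-ins for code an LLM would generate, and a simple word-overlap score is swapped in for whatever retrieval mechanism the paper actually uses.

```python
from typing import Dict, List


class ActionLibrary:
    """Hypothetical store of actions (Python functions) generated so far."""

    def __init__(self) -> None:
        # One shared namespace, so later actions can call earlier ones.
        self.namespace: Dict[str, object] = {}
        self.sources: Dict[str, str] = {}  # action name -> source code

    def add(self, source: str) -> List[str]:
        """Execute generated source and register the functions it defines."""
        before = set(self.namespace)
        # Assumption: in practice this would run inside a sandbox, since
        # executing model-generated code directly is unsafe.
        exec(source, self.namespace)
        new_names = [
            name for name, obj in self.namespace.items()
            if name not in before and callable(obj) and not name.startswith("_")
        ]
        for name in new_names:
            self.sources[name] = source
        return new_names

    def retrieve(self, query: str, k: int = 3) -> List[str]:
        """Rank stored actions by word overlap with the query (a stand-in)."""
        words = set(query.lower().split())

        def score(name: str) -> int:
            doc = self.namespace[name].__doc__ or ""
            text = doc + " " + name.replace("_", " ")
            return len(words & set(text.lower().split()))

        ranked = sorted(self.sources, key=score, reverse=True)
        return [name for name in ranked if score(name) > 0][:k]


library = ActionLibrary()

# First "generated" action.
library.add('''
def mean(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)
''')

# A later action composes the earlier one instead of reimplementing it.
library.add('''
def variance(values):
    """Return the population variance of a list of numbers, reusing mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)
''')

best = library.retrieve("compute the variance of these numbers", k=1)
result = library.namespace[best[0]]([1, 2, 3, 4])
```

Keeping every action in one shared namespace is the design choice that makes composition possible here: `variance` resolves `mean` at call time because both were defined in the same globals dictionary.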
