ActionStudio: A Lightweight Framework for Data and Training of Large Action Models

Jianguo Zhang; Thai Hoang; Ming Zhu; Zuxin Liu; Shiyu Wang; Tulika Awalgaonkar; Akshara Prabhakar; Haolin Chen; Weiran Yao; Zhiwei Liu; Juntao Tan; Juan Carlos Niebles; Shelby Heinecke; Huan Wang; Silvio Savarese; Caiming Xiong

TL;DR

ActionStudio addresses fragmentation in agentic data processing and training by introducing Unified Format 2.0, a critique-and-filter data pipeline with real-time verification, and an end-to-end, open-source framework for scalable training of large action models (LAMs). It also releases ActionStudio-98k, a high-quality trajectory dataset spanning 30,000 APIs and 300 domains, and demonstrates up to $9\times$ higher throughput than existing agent-centric frameworks while achieving state-of-the-art results on NexusRaven and the CRM Agent Benchmark. The framework supports flexible training regimes (LoRA, full fine-tuning) and scales well across multiple nodes, enabling practical deployment in diverse industrial and research contexts. Overall, ActionStudio lowers barriers to developing robust, domain-general LAMs and provides a reproducible, extensible platform for future agentic AI research and applications.

Abstract

Large action models are essential for enabling autonomous agents to perform complex tasks. However, training such models remains challenging due to the diversity of agent environments and the complexity of noisy agentic data. Existing infrastructure offers limited support for scalable, agent-specific fine-tuning and standardized agent data processing. We introduce ActionStudio, a lightweight and extensible data and training framework designed for large action models. ActionStudio unifies diverse agent trajectories using our proposed Unified Format 2.0, supports a range of training workflows with an optimized multi-node distributed setup, and integrates robust preprocessing and real-time verification tools. ActionStudio demonstrates up to 9x higher throughput than existing agentic training frameworks, and our trained models achieve top performance across public and realistic agent benchmarks. To support the broader research community, we open-source the ActionStudio framework and release actionstudio-98k, a curated dataset of 98k high-quality trajectories. Code: https://github.com/SalesforceAIResearch/xLAM.
