An Empirical Study of Agent Developer Practices in AI Agent Frameworks
Yanlin Wang, Xinyi Xu, Jiachi Chen, Tingting Bi, Wenchao Gu, Zibin Zheng
TL;DR
The paper addresses the fragmentation of LLM-based agent frameworks and their poorly understood impact on agent development through a large-scale, developer-centered empirical study. It compiles data from 1,575 agent repositories and 11,910 discussions across ten frameworks, maps framework usage to the software development lifecycle (SDLC), and evaluates the frameworks along five developer-centered dimensions. Key contributions include a taxonomy of SDLC-based development challenges, a five-dimensional evaluation of the frameworks, and actionable guidance for developers and framework designers on framework selection, composition, and long-term maintainability. The findings reveal four functional roles for frameworks, widespread multi-framework adoption, and notable gaps in learning cost, abstraction, performance, and maintainability, with practical implications for building robust, scalable LLM-driven agent ecosystems.
Abstract
The rise of large language models (LLMs) has sparked a surge of interest in agents, leading to the rapid growth of agent frameworks. Agent frameworks are software toolkits and libraries that provide standardized components, abstractions, and orchestration mechanisms to simplify agent development. Despite the widespread use of agent frameworks, their practical applications and their influence on the agent development process remain underexplored. Developers encounter similar problems across different agent frameworks, indicating that these recurring issues deserve greater attention and call for further improvements in agent framework design. Meanwhile, as the number of agent frameworks continues to grow and evolve, more than 80% of developers report difficulties in identifying the frameworks that best meet their specific development requirements. In this paper, we conduct the first empirical study of LLM-based agent frameworks, exploring the real-world experiences of developers in building AI agents. To compare how well the agent frameworks meet developer needs, we further collect developer discussions for the ten previously identified agent frameworks, resulting in a total of 11,910 discussions. Finally, by analyzing these discussions, we compare the frameworks across five dimensions: development efficiency, functional abstraction, learning cost, performance optimization, and maintainability. Here, maintainability refers to how easily developers can update and extend both the framework itself and the agents built upon it over time. Our comparative analysis reveals significant differences among frameworks in how they meet the needs of agent developers. Overall, we provide a set of findings and implications for the LLM-driven AI agent framework ecosystem and offer insights for both the designers of future LLM-based agent frameworks and agent developers.
