MAO: A Framework for Process Model Generation with Multi-Agent Orchestration
Leilei Lin, Yumeng Jin, Yingming Zhou, Wenlong Chen, Chen Qian
TL;DR
MAO is a multi-agent orchestration framework that uses large language models to autonomously generate BPMN process diagrams from textual requirements. The workflow progresses through four phases (Generation, Refinement, Reviewing, and Testing) to produce semantically correct and format-consistent BPMN texts, addressing semantic and format hallucinations via structured prompting and external tooling. Empirical results on FG-C and CG-O show that MAO outperforms ProMoAI and human participants in quality and efficiency, and ablation analyses confirm the importance of the Testing and Reviewing phases for complex process structures. This work demonstrates a practical, cost-effective approach to automated process modeling with robust error correction and potential for broader BPMN element support.
Abstract
Process models are frequently used in software engineering to describe business requirements, guide software testing, and control system improvement. However, traditional process modeling methods often require the participation of numerous experts, which is expensive and time-consuming. The search for a more efficient and cost-effective automated modeling method has therefore become a focal point of current research. This article explores a framework for automatically generating process models with multi-agent orchestration (MAO), aiming to enhance the efficiency of process modeling and offer valuable insights for domain experts. MAO uses large language models as the foundation of its agents and employs an innovative prompt strategy to ensure efficient collaboration among them. The framework proceeds in four phases: 1) Generation: the agents generate a rough initial process model from the text description; 2) Refinement: the agents continuously refine the initial process model through multiple rounds of dialogue; 3) Reviewing: because large language models are prone to hallucination in multi-turn dialogues, the agents review the process model and repair semantic hallucinations; 4) Testing: because process models can be represented in diverse ways, the agents use external tools to test whether the generated process model contains format errors, namely format hallucinations, and then adjust it to conform to the output paradigm. Experiments demonstrate that the process models generated by our framework outperform existing methods and surpass manual modeling by 89%, 61%, 52%, and 75% on four different datasets, respectively.
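
The control flow of the four phases can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the function name `run_pipeline`, the prompt wording, the round counts, the stub `fake_llm` standing in for a large language model, and the toy `check_parens` standing in for the external format-checking tool.

```python
from typing import Callable, List

# Hypothetical types: an LLM maps a prompt to a reply; a checker maps a
# model text to a list of format errors (empty when the text is valid).
LLM = Callable[[str], str]
Checker = Callable[[str], List[str]]


def run_pipeline(requirements: str, llm: LLM, check_format: Checker,
                 refine_rounds: int = 2, max_repairs: int = 3) -> str:
    """Run Generation, Refinement, Reviewing, and Testing in sequence."""
    # 1) Generation: draft a rough process model from the text description.
    model = llm(f"GENERATE a process model for:\n{requirements}")

    # 2) Refinement: improve the draft over several dialogue rounds.
    for _ in range(refine_rounds):
        model = llm(f"REFINE this process model:\n{model}")

    # 3) Reviewing: compare the model with the requirements so that
    #    semantic hallucinations can be repaired.
    model = llm(f"REVIEW against requirements:\n{requirements}\nModel:\n{model}")

    # 4) Testing: ask an external tool for format errors and request a
    #    repair until the tool reports none or the budget is exhausted.
    for _ in range(max_repairs):
        errors = check_format(model)
        if not errors:
            break
        model = llm(f"FIX format errors {errors} in:\n{model}")
    return model


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for an LLM (assumption, for demonstration)."""
    if prompt.startswith("GENERATE"):
        return "start -> (A -> B"              # draft with a format error
    if prompt.startswith("FIX"):
        return prompt.split("\n", 1)[1] + ")"  # close the open parenthesis
    return prompt.rsplit("\n", 1)[-1]          # REFINE/REVIEW: echo the model


def check_parens(model: str) -> List[str]:
    """Toy format check standing in for an external validation tool."""
    balanced = model.count("(") == model.count(")")
    return [] if balanced else ["unbalanced parentheses"]


result = run_pipeline("Handle an order.", fake_llm, check_parens)
```

In this sketch the Testing loop is the only phase driven by a tool outside the language model, which mirrors the abstract's distinction between semantic hallucinations (repaired during Reviewing) and format hallucinations (detected by external tools during Testing).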
