Designing LLM-based Multi-Agent Systems for Software Engineering Tasks: Quality Attributes, Design Patterns and Rationale
Yangxiao Cai, Ruiyin Li, Peng Liang, Mojtaba Shahin, Zengyang Li
TL;DR
This paper addresses the lack of systematic design guidance for LLM-based multi-agent systems in software engineering. By analyzing 94 papers, it maps which SE tasks are tackled, which quality attributes are prioritized (notably Functional Suitability and Functional Correctness), which design patterns are most common (especially Role-Based Cooperation and Self-Reflection), and what rationales guide these designs (primarily improving code quality). The findings provide actionable implications for building robust, maintainable, and efficient MASs across the software development life cycle, including human-in-the-loop and end-to-end approaches. The work advances design knowledge in LLM-based MASs and offers a framework for evaluating and improving SE-focused agent collaboration.
Abstract
As the complexity of Software Engineering (SE) tasks continues to escalate, Multi-Agent Systems (MASs) have emerged as a focal point of research and practice due to their autonomy and scalability. Furthermore, by leveraging the reasoning and planning capabilities of Large Language Models (LLMs), LLM-based MASs are attracting increasing attention in the field of SE. However, no dedicated study has systematically explored the design of LLM-based MASs, including the Quality Attributes (QAs) on which designers mainly focus, the design patterns they use, and the rationale guiding the design of LLM-based MASs for SE tasks. To this end, we conducted a study to identify the QAs that LLM-based MASs for SE tasks focus on, the design patterns used in the MASs, and the design rationale for the MASs. We collected 94 papers on LLM-based MASs for SE tasks as our data source. Our study shows that: (1) Code Generation is the most common SE task addressed by LLM-based MASs among the ten identified SE tasks, (2) Functional Suitability is the QA to which designers of LLM-based MASs pay the most attention, (3) Role-Based Cooperation is the most frequently employed design pattern among the 16 patterns used to construct LLM-based MASs, and (4) Improving the Quality of Generated Code is the most common rationale behind the design of LLM-based MASs. Based on the study results, we present implications for the design of LLM-based MASs to support SE tasks.
