NOMAD: A Multi-Agent LLM System for UML Class Diagram Generation from Natural Language Requirements
Polydoros Giannouris, Sophia Ananiadou
TL;DR
The paper tackles the challenge of generating accurate UML class diagrams from natural language requirements, proposing NOMAD, a cognitively inspired multi-agent LLM system that decomposes the task into specialised subtasks. NOMAD uses a four-actor generation pipeline plus an optional verifier to produce a structured model in JSON and PlantUML, paired with a formal error taxonomy and automated evaluation. Across a Northwind 2.0 case study and eight supplementary use cases, NOMAD consistently improves structural accuracy, especially in relationship reasoning, while highlighting attribute extraction as an enduring challenge; verification provides additional gains in completeness and consistency. The work offers a practical, interpretable, and extensible framework for reliable language-to-model workflows within Model-Driven Engineering, and sets the stage for broader multi-agent, verifiable modelling approaches.
Abstract
Large Language Models (LLMs) are increasingly utilised in software engineering, yet their ability to generate structured artefacts such as UML diagrams remains underexplored. In this work, we present NOMAD, a cognitively inspired, modular multi-agent framework that decomposes UML generation into a series of role-specialised subtasks. Each agent handles a distinct modelling activity, such as entity extraction, relationship classification, and diagram synthesis, mirroring the goal-directed reasoning processes of an engineer. This decomposition improves interpretability and allows for targeted verification strategies. We evaluate NOMAD through a mixed design: a large case study (Northwind) for in-depth probing and error analysis, and human-authored UML exercises for breadth and realism. NOMAD outperforms all selected baselines, while revealing persistent challenges in fine-grained attribute extraction. Building on these observations, we introduce the first systematic taxonomy of errors in LLM-generated UML diagrams, categorising them as structural, relationship, and semantic/logical errors. Finally, we examine verification as a design probe, showing its mixed effects and outlining adaptive strategies as promising directions. Together, these contributions position NOMAD as both an effective framework for UML class diagram generation and a lens onto the broader research challenges of reliable language-to-model workflows.
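
To make the decomposition described above concrete, the following is a minimal illustrative sketch, not NOMAD's actual implementation. It assumes a hypothetical intermediate model (`ClassModel`), two hypothetical agent functions with hard-coded outputs standing in for LLM calls, and a small renderer that emits PlantUML. The real system's agent roles, prompts, and JSON schema are not specified in this section, so every name below is an assumption.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ClassModel:
    """Hypothetical intermediate model shared between agents (assumed schema)."""
    classes: dict = field(default_factory=dict)        # class name -> attribute names
    relationships: list = field(default_factory=list)  # (source, kind, target) triples


# PlantUML arrow syntax for a few common relationship kinds.
ARROWS = {
    "association": "-->",
    "inheritance": "--|>",   # source is the subclass
    "aggregation": "o--",    # source is the whole
    "composition": "*--",    # source is the whole
}


def entity_agent(text: str, model: ClassModel) -> ClassModel:
    # Stand-in for an LLM-backed entity-extraction step; output is hard-coded.
    model.classes = {"Customer": ["name"], "Order": ["orderDate"]}
    return model


def relationship_agent(text: str, model: ClassModel) -> ClassModel:
    # Stand-in for an LLM-backed relationship-classification step.
    model.relationships = [("Customer", "association", "Order")]
    return model


def run_pipeline(text: str, agents) -> ClassModel:
    # Each agent receives the requirements and the model built so far.
    model = ClassModel()
    for agent in agents:
        model = agent(text, model)
    return model


def to_plantuml(model: ClassModel) -> str:
    # Diagram synthesis: render the structured model as PlantUML text.
    lines = ["@startuml"]
    for name, attrs in model.classes.items():
        lines.append(f"class {name} {{")
        lines.extend(f"  {attr}" for attr in attrs)
        lines.append("}")
    for src, kind, dst in model.relationships:
        lines.append(f"{src} {ARROWS[kind]} {dst}")
    lines.append("@enduml")
    return "\n".join(lines)


model = run_pipeline("A customer places orders.", [entity_agent, relationship_agent])
as_json = json.dumps(asdict(model), indent=2)  # structured JSON view of the model
uml = to_plantuml(model)                       # PlantUML view of the same model
```

Passing one shared model through a list of single-purpose functions means each step's output can be inspected or checked on its own, which is the property the abstract attributes to the decomposition. An optional verifier could be appended as one more callable in the same list.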
