From Idea to CAD: A Language Model-Driven Multi-Agent System for Collaborative Design
Felix Ocker, Stefan Menzel, Ahmed Sadik, Thiago Rios
TL;DR
This work presents a Vision-Language Model–driven Multi-Agent System that mimics engineering team roles to automatically generate parametric CAD models from sketches or text. By coupling a RequirementsEngineer, CadEngineer, and QualityAssuranceEngineer within a V-model–inspired loop, the approach enables iterative specification, code generation with CadQuery, visual verification, and user-driven validation. Experimental results and ablations demonstrate higher design readiness compared to a single-shot baseline, while highlighting challenges in spatial reasoning and dependency on CAD tooling capabilities. The framework has practical implications for both industry professionals and hobbyists by lowering entry barriers and enabling collaborative, iterative CAD design with a self-feedback loop.
Abstract
Creating digital models using Computer Aided Design (CAD) is a process that requires in-depth expertise. In industrial product development, this process typically involves entire teams of engineers, spanning requirements engineering, CAD itself, and quality assurance. We present an approach that mirrors this team structure with a Vision Language Model (VLM)-based Multi-Agent System that has access to parametric CAD tooling and tool documentation. By combining agents for requirements engineering, CAD engineering, and vision-based quality assurance, the system generates a model automatically from sketches and/or textual descriptions. The resulting model can be refined collaboratively in an iterative validation loop with the user. Our approach has the potential to increase the effectiveness of design processes, both for industry experts and for hobbyists who create models for 3D printing. We demonstrate the potential of the architecture on a range of design tasks and provide several ablations that show the benefits of the architecture's individual components.
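The agent loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `run_design_loop`, `DesignState`, `Review`, and the three `stub_*` functions are hypothetical, the agents are deterministic stand-ins rather than VLM calls, and the quality check is a plain text match instead of the vision-based review of rendered images. The CadQuery snippet is only assembled as a string and never executed, and the final validation step with the user is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    """Verdict of the (hypothetical) quality assurance step."""
    approved: bool
    feedback: str = ""


@dataclass
class DesignState:
    """Artifacts accumulated while working on one design request."""
    request: str
    spec: str = ""
    cad_code: str = ""
    history: List[str] = field(default_factory=list)


def run_design_loop(
    request: str,
    write_spec: Callable[[str], str],
    write_cad: Callable[[str, str], str],
    review: Callable[[str, str], Review],
    max_rounds: int = 3,
) -> DesignState:
    """Specify once, then alternate CAD generation and review until approved."""
    state = DesignState(request=request)
    state.spec = write_spec(request)
    feedback = ""
    for round_no in range(1, max_rounds + 1):
        state.cad_code = write_cad(state.spec, feedback)
        result = review(state.spec, state.cad_code)
        note = "ok" if result.approved else result.feedback
        state.history.append(f"round {round_no}: {note}")
        if result.approved:
            break
        feedback = result.feedback  # passed to the next CAD attempt
    return state


# Stand-ins for the three roles (assumptions, not real agents).
def stub_requirements(request: str) -> str:
    return "plate 40 x 20 x 5 mm with a 6 mm through hole"


def stub_cad(spec: str, feedback: str) -> str:
    # Builds CadQuery source as text; the first attempt forgets the hole.
    model = 'cq.Workplane("XY").box(40, 20, 5)'
    if "hole" in feedback:
        model += '.faces(">Z").workplane().hole(6)'
    return "import cadquery as cq\nresult = " + model


def stub_review(spec: str, cad_code: str) -> Review:
    # Text check standing in for the vision-based inspection of renderings.
    if "hole" in spec and ".hole(" not in cad_code:
        return Review(approved=False, feedback="missing through hole")
    return Review(approved=True)


if __name__ == "__main__":
    final = run_design_loop(
        "plate with a hole", stub_requirements, stub_cad, stub_review
    )
    print("\n".join(final.history))
    print(final.cad_code)
```

Passing the agents in as plain callables keeps the control flow separate from any particular model or CAD backend, so each role could in principle be swapped or removed independently, in the spirit of the ablations mentioned above.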
