From Idea to CAD: A Language Model-Driven Multi-Agent System for Collaborative Design

Felix Ocker, Stefan Menzel, Ahmed Sadik, Thiago Rios

TL;DR

This work presents a Vision-Language Model–driven Multi-Agent System that mimics engineering team roles to automatically generate parametric CAD models from sketches or text. By coupling a RequirementsEngineer, CadEngineer, and QualityAssuranceEngineer within a V-model–inspired loop, the approach enables iterative specification, code generation with CadQuery, visual verification, and user-driven validation. Experimental results and ablations demonstrate higher design readiness compared to a single-shot baseline, while highlighting challenges in spatial reasoning and dependency on CAD tooling capabilities. The framework has practical implications for both industry professionals and hobbyists by lowering entry barriers and enabling collaborative, iterative CAD design with a self-feedback loop.

Abstract

Creating digital models using Computer Aided Design (CAD) is a process that requires in-depth expertise. In industrial product development, this process typically involves entire teams of engineers, spanning requirements engineering, CAD itself, and quality assurance. We present an approach that mirrors this team structure with a Vision Language Model (VLM)-based Multi-Agent System that has access to parametric CAD tooling and tool documentation. By combining agents for requirements engineering, CAD engineering, and vision-based quality assurance, the system generates a model automatically from sketches and/or textual descriptions. The resulting model can be refined collaboratively in an iterative validation loop with the user. Our approach has the potential to increase the effectiveness of design processes, both for industry experts and for hobbyists who create models for 3D printing. We demonstrate the potential of the architecture on various design tasks and provide several ablations that show the benefits of the architecture's individual components.
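
The interplay of the three agent roles can be illustrated with a minimal sketch. This is not the authors' implementation: the function `run_design_loop`, the `DesignState` fields, and the stub agents are hypothetical names introduced here, and plain Python callables stand in for the VLM-backed agents and the CAD tooling. The sketch only shows the control flow of specify, generate, verify, and retry.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DesignState:
    """Hypothetical container for what is passed between the agents."""
    request: str
    spec: str = ""
    cad_code: str = ""
    feedback: List[str] = field(default_factory=list)
    iterations: int = 0


def run_design_loop(
    request: str,
    requirements_engineer: Callable[[str], str],
    cad_engineer: Callable[[str, List[str]], str],
    qa_engineer: Callable[[str, str], List[str]],
    max_iterations: int = 3,
) -> DesignState:
    """Specify once, then alternate generation and verification."""
    state = DesignState(request=request)
    # Requirement elicitation: turn the raw request into a specification.
    state.spec = requirements_engineer(request)
    for _ in range(max_iterations):
        state.iterations += 1
        # Model creation: produce CAD code from the spec and prior feedback.
        state.cad_code = cad_engineer(state.spec, state.feedback)
        # Verification: collect issues; an empty list means the check passed.
        issues = qa_engineer(state.spec, state.cad_code)
        if not issues:
            break
        state.feedback.extend(issues)
    return state


# Stub agents (assumptions for illustration; real agents would call a VLM).
def stub_requirements_engineer(request: str) -> str:
    return f"Spec: {request}; must include a through hole"


def stub_cad_engineer(spec: str, feedback: List[str]) -> str:
    # Pretend the first attempt forgets the hole and the retry adds it.
    return "plate_with_hole" if feedback else "plate"


def stub_qa_engineer(spec: str, cad_code: str) -> List[str]:
    return [] if "hole" in cad_code else ["through hole is missing"]


result = run_design_loop(
    "a mounting plate",
    stub_requirements_engineer,
    stub_cad_engineer,
    stub_qa_engineer,
)
print(result.iterations, result.cad_code, result.feedback)
```

User-driven validation, as described in the abstract, would wrap this loop: the user inspects the returned model and, if unsatisfied, supplies new input that starts another round.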

Paper Structure

This paper contains 17 sections, 3 figures, 2 tables, 4 algorithms.

Figures (3)

  • Figure 1: The development phases our approach focuses on, illustrated using the V-model: (1) requirement elicitation, (2) model creation, (3) verification, and (4) validation.
  • Figure 2: Engineering team architecture.
  • Figure 3: Examples of the iterative design process from the user specification to the resulting model. The first column, "Inputs", shows the visual inputs; the second column, "0-shot", shows the baseline results without requirement clarification, verification, and validation; and the right part shows the iterations with our approach. Dashed lines indicate validation steps. Red borders indicate wrong models; green borders indicate acceptable designs.