EvoCAD: Evolutionary CAD Code Generation with Vision Language Models

Tobias Preintner; Weixuan Yuan; Adrian König; Thomas Bäck; Elena Raponi; Niki van Stein

TL;DR

EvoCAD addresses the challenge of generating CAD objects from natural language prompts by combining large language models (LLMs) with evolutionary optimization and vision-language reasoning. The method initializes a diverse population of CAD code candidates, then evolves it iteratively: vision language models (VLMs) describe the rendered objects, reasoning language models (RLMs) rank the candidates, and LLM-driven crossover and mutation produce the next generation. The work introduces two topology metrics based on the Euler characteristic, which capture a form of semantic similarity that purely geometric metrics miss. Evaluated with GPT-4V and GPT-4o on the CADPrompt benchmark, EvoCAD outperforms prior methods on multiple metrics, most notably topological correctness. The authors note that the current experiments are limited in population size and number of generations; future work aims to scale the optimization and extend the evaluation to more complex data.
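The loop summarized above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: in EvoCAD the ranking, crossover, and mutation steps are model calls, whereas here they are plain Python callables. The function `evolve` and the toy operators `toy_score`, `toy_crossover`, and `toy_mutate` are hypothetical names introduced only for this example, and the half-population parent selection is an arbitrary choice.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def evolve(
    population: Sequence[T],
    score: Callable[[T], float],
    crossover: Callable[[T, T], T],
    mutate: Callable[[T, random.Random], T],
    generations: int = 3,
    seed: int = 0,
) -> T:
    """Generic rank-select-recombine loop (assumed structure, not the paper's code)."""
    rng = random.Random(seed)
    pop: List[T] = list(population)
    for _ in range(generations):
        # Rank candidates; in EvoCAD this role is played by model-based ranking.
        ranked = sorted(pop, key=score, reverse=True)
        # Keep the better half (at least two) as parents, so the best survives.
        parents = ranked[: max(2, len(ranked) // 2)]
        children: List[T] = []
        while len(parents) + len(children) < len(pop):
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b), rng))
        pop = parents + children
    return max(pop, key=score)


# Toy stand-ins: a "design" is a (width, depth, height) tuple and the
# score is the negative distance to a hypothetical target box.
Dims = Tuple[int, int, int]
TARGET: Dims = (10, 20, 5)


def toy_score(dims: Dims) -> float:
    return -float(sum(abs(d - t) for d, t in zip(dims, TARGET)))


def toy_crossover(a: Dims, b: Dims) -> Dims:
    # Take the depth from one parent and the rest from the other.
    return (a[0], b[1], a[2])


def toy_mutate(dims: Dims, rng: random.Random) -> Dims:
    # Nudge one randomly chosen dimension by one unit.
    i = rng.randrange(len(dims))
    out = list(dims)
    out[i] += rng.choice((-1, 1))
    return (out[0], out[1], out[2])
```

Because the parents are carried over unchanged, the best score in the population cannot decrease from one generation to the next. That is a property of this sketch, not a claim about EvoCAD itself.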

Abstract

Combining large language models with evolutionary computation algorithms represents a promising research direction leveraging the remarkable generative and in-context learning capabilities of LLMs with the strengths of evolutionary algorithms. In this work, we present EvoCAD, a method for generating computer-aided design (CAD) objects through their symbolic representations using vision language models and evolutionary optimization. Our method samples multiple CAD objects, which are then optimized using an evolutionary approach with vision language and reasoning language models. We assess our method using GPT-4V and GPT-4o, evaluating it on the CADPrompt benchmark dataset and comparing it to prior methods. Additionally, we introduce two new metrics based on topological properties defined by the Euler characteristic, which capture a form of semantic similarity between 3D objects. Our results demonstrate that EvoCAD outperforms previous approaches on multiple metrics, particularly in generating topologically correct objects, which can be efficiently evaluated using our two novel metrics that complement existing spatial metrics.
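To make the topological idea concrete: for a polygonal surface mesh, the Euler characteristic is χ = V − E + F (vertices minus edges plus faces). A closed surface without holes has χ = 2, and a torus has χ = 0. The sketch below computes χ from a list of faces and compares two meshes. The helper names `euler_characteristic`, `topology_match`, and `topology_gap` are hypothetical, and the two comparisons are illustrative assumptions rather than the metric definitions used in the paper.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Face = Sequence[int]  # vertex indices of one polygon, in order


def euler_characteristic(faces: Iterable[Face]) -> int:
    """Return V - E + F for a polygon mesh given as vertex-index faces."""
    vertices: Set[int] = set()
    edges: Set[Tuple[int, int]] = set()
    n_faces = 0
    for face in faces:
        n_faces += 1
        k = len(face)
        for i in range(k):
            a, b = face[i], face[(i + 1) % k]
            vertices.add(a)
            # Store each edge once, regardless of traversal direction.
            edges.add((min(a, b), max(a, b)))
    return len(vertices) - len(edges) + n_faces


def topology_match(faces_a: Iterable[Face], faces_b: Iterable[Face]) -> bool:
    """Illustrative check: do both meshes share the same Euler characteristic?"""
    return euler_characteristic(faces_a) == euler_characteristic(faces_b)


def topology_gap(faces_a: Iterable[Face], faces_b: Iterable[Face]) -> int:
    """Illustrative distance: absolute difference of Euler characteristics."""
    return abs(euler_characteristic(faces_a) - euler_characteristic(faces_b))


def torus_quads(n: int = 3) -> List[Face]:
    """Quad mesh of a torus on an n x n periodic grid (n >= 3)."""
    idx = lambda i, j: (i % n) * n + (j % n)
    return [
        (idx(i, j), idx(i + 1, j), idx(i + 1, j + 1), idx(i, j + 1))
        for i in range(n)
        for j in range(n)
    ]


# A tetrahedron is a closed surface without holes.
TETRAHEDRON: List[Face] = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
```

A tetrahedron and a torus can occupy similar regions of space yet differ in χ (2 versus 0). This is the kind of difference a topology-based measure can detect and a purely spatial one may miss.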
