EvoCAD: Evolutionary CAD Code Generation with Vision Language Models

Tobias Preintner, Weixuan Yuan, Adrian König, Thomas Bäck, Elena Raponi, Niki van Stein

TL;DR

EvoCAD addresses the challenge of generating CAD objects from natural-language prompts by combining large language models (LLMs) with evolutionary optimization and vision-language reasoning. The method initializes a diverse population of CAD code candidates, then iteratively evolves it: vision language models (VLMs) describe the rendered objects, reasoning language models (RLMs) rank the candidates, and LLM-driven crossover and mutation produce new ones, all guided by a topology-aware evaluation framework. It introduces two topology metrics based on the Euler characteristic that capture semantic similarity beyond geometric similarity, and it outperforms prior methods on the CADPrompt benchmark when deployed with GPT-4V and GPT-4o, especially in topological correctness. The work highlights practical gains for CAD design automation while acknowledging current limitations in population size and number of generations; future work aims to scale the optimization and extend the evaluation to more complex data.

Abstract

Combining large language models with evolutionary computation algorithms represents a promising research direction leveraging the remarkable generative and in-context learning capabilities of LLMs with the strengths of evolutionary algorithms. In this work, we present EvoCAD, a method for generating computer-aided design (CAD) objects through their symbolic representations using vision language models and evolutionary optimization. Our method samples multiple CAD objects, which are then optimized using an evolutionary approach with vision language and reasoning language models. We assess our method using GPT-4V and GPT-4o, evaluating it on the CADPrompt benchmark dataset and comparing it to prior methods. Additionally, we introduce two new metrics based on topological properties defined by the Euler characteristic, which capture a form of semantic similarity between 3D objects. Our results demonstrate that EvoCAD outperforms previous approaches on multiple metrics, particularly in generating topologically correct objects, which can be efficiently evaluated using our two novel metrics that complement existing spatial metrics.
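The evolutionary loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `evolve`, its parameters, the callables `generate`, `describe`, `rank`, `crossover`, and `mutate`, and the elitist "keep the top half" selection rule are all hypothetical stand-ins for the LLM, VLM, and RLM calls.

```python
import random
from typing import Callable, List, TypeVar

Candidate = TypeVar("Candidate")  # e.g. a CAD program stored as a string


def evolve(
    prompt: str,
    generate: Callable[[str], Candidate],              # hypothetical LLM sampler
    describe: Callable[[Candidate], str],              # hypothetical VLM description
    rank: Callable[[str, List[str]], List[int]],       # hypothetical RLM ranking, best first
    crossover: Callable[[Candidate, Candidate], Candidate],  # hypothetical LLM crossover
    mutate: Callable[[Candidate], Candidate],          # hypothetical LLM mutation
    population_size: int = 4,
    generations: int = 2,
    seed: int = 0,
) -> Candidate:
    """Return the best candidate after a fixed number of generations.

    Assumption: the better half of the population survives each generation
    (elitism) and the rest is refilled with mutated crossover children.
    """
    if population_size < 2:
        raise ValueError("population_size must be at least 2")
    rng = random.Random(seed)

    # Initialization: sample several candidates for the same prompt.
    population = [generate(prompt) for _ in range(population_size)]

    for _ in range(generations):
        # Describe each candidate, then rank the descriptions against the prompt.
        descriptions = [describe(candidate) for candidate in population]
        order = rank(prompt, descriptions)
        ranked = [population[i] for i in order]

        # Selection: keep the better half (at least two parents).
        parents = ranked[: max(2, population_size // 2)]

        # Variation: refill the population with mutated crossover children.
        children = []
        while len(parents) + len(children) < population_size:
            parent_a, parent_b = rng.sample(parents, 2)
            children.append(mutate(crossover(parent_a, parent_b)))
        population = parents + children

    # Final ranking picks the candidate to return.
    final_order = rank(prompt, [describe(candidate) for candidate in population])
    return population[final_order[0]]
```

In practice each callable would wrap a model request plus CAD code execution and rendering; passing them in as parameters keeps the loop itself testable with cheap deterministic stand-ins.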

Paper Structure

This paper contains 12 sections, 7 equations, 3 figures, 1 table, 1 algorithm.

Figures (3)

  • Figure 1: Visual illustration of our evolutionary CAD code generation method. Beginning with a textual user prompt describing the desired object, our approach initializes a population of CAD codes using a large language model, followed by evolutionary optimization with vision and reasoning language models.
  • Figure 2: Left: $T_{corr}$, defined as the ratio of samples where the Euler characteristic of the generated object matches that of the ground truth. Right: $T_{err}$, defined as the average difference between the Euler characteristics of the generated object and the ground truth.
  • Figure 3: Qualitative comparison of our method with prior works. Given a prompt describing a CAD object, our method generates objects that more accurately follow the prompt, align with the ground truth, and exhibit greater topological correctness, as measured by the Euler characteristic $\chi$. Green denotes a $\chi$ value that matches the ground truth, while red indicates a deviation.
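
The two topology metrics named in the Figure 2 caption can be illustrated with a short Python sketch. It rests on assumptions not confirmed by this page: the Euler characteristic is computed from a closed triangle mesh as $\chi = V - E + F$, the "average difference" is taken as the mean absolute difference, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Triangle = Tuple[int, int, int]  # a face given as three vertex indices


def euler_characteristic(faces: Sequence[Triangle]) -> int:
    """Compute chi = V - E + F for a triangle mesh."""
    vertices = {v for face in faces for v in face}
    edges = set()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))  # undirected edge
    return len(vertices) - len(edges) + len(faces)


def topology_correctness(generated: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of samples whose chi matches the ground truth (T_corr)."""
    pairs = list(zip(generated, truth))
    return sum(g == t for g, t in pairs) / len(pairs)


def topology_error(generated: Sequence[int], truth: Sequence[int]) -> float:
    """Mean absolute chi difference (T_err); absolute value is an assumption."""
    pairs = list(zip(generated, truth))
    return sum(abs(g - t) for g, t in pairs) / len(pairs)


def torus_faces(n: int) -> List[Triangle]:
    """Triangulate an n x n periodic grid, which is a torus for n >= 3."""
    def vid(i: int, j: int) -> int:
        return (i % n) * n + (j % n)

    faces = []
    for i in range(n):
        for j in range(n):
            a, b = vid(i, j), vid(i + 1, j)
            c, d = vid(i + 1, j + 1), vid(i, j + 1)
            faces.extend([(a, b, c), (a, c, d)])  # split each quad in two
    return faces


tetrahedron = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
print(euler_characteristic(tetrahedron))     # 2: sphere-like, no holes
print(euler_characteristic(torus_faces(3)))  # 0: one through-hole
```

For a closed orientable surface, $\chi = 2 - 2g$ with $g$ the number of through-holes, so a generated part that omits a hole changes $\chi$ even when its overall shape stays close to the ground truth. That is the sense in which these metrics complement spatial ones.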