Towards High-Fidelity CAD Generation via LLM-Driven Program Generation and Text-Based B-Rep Primitive Grounding

Jiahao Li, Qingwang Zhang, Qiuyu Chen, Guozhan Qiu, Yunzhong Lou, Xiangdong Zhou

Abstract

The field of Computer-Aided Design (CAD) generation has made significant progress in recent years. Existing methods typically fall into two separate categories: parametric CAD modeling and direct boundary representation (B-Rep) synthesis. In modern feature-based CAD systems, parametric modeling and B-Rep are inherently intertwined, as advanced parametric operations (e.g., fillet and chamfer) require explicit selection of B-Rep geometric primitives, and the B-Rep itself is derived from parametric operations. Consequently, this paradigm gap remains a critical factor limiting AI-driven CAD modeling for complex industrial product design. This paper presents FutureCAD, a novel text-to-CAD framework that leverages large language models (LLMs) and a B-Rep grounding transformer (BRepGround) for high-fidelity CAD generation. Our method generates executable CadQuery scripts and introduces a text-based query mechanism that enables the LLM to specify geometric selections via natural language, which BRepGround then grounds to the target primitives. To train our framework, we construct a new dataset comprising real-world CAD models. For the LLM, we apply supervised fine-tuning (SFT) to establish fundamental CAD generation capabilities, followed by reinforcement learning (RL) to improve generalization. Experiments show that FutureCAD achieves state-of-the-art CAD generation performance.

Paper Structure (21 sections, 22 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Given a textual description, FutureCAD synergizes LLM-driven program generation and text-based B-Rep primitive grounding to support feature-based CAD modeling, enabling high-fidelity CAD model generation.
  • Figure 2: Overview of the FutureCAD framework. Left: An example CadQuery program with text-based queries (highlighted in orange) for B-Rep primitive selection. Top: The training pipeline includes BRepGround training for primitive grounding, and LLM training with supervised fine-tuning (SFT) followed by reinforcement learning (RL) with GSPO using Chamfer Distance-based rewards. Bottom: Illustration of the CAD modeling process: the LLM generates parametric features executed by the CAD kernel, and when operations require primitive references, the LLM produces textual queries for BRepGround to resolve, with the results enabling the kernel to proceed.
  • Figure 3: Architecture of BRepGround. (a) The B-Rep encoder extracts face and edge features and produces primitive embeddings via a GNN and adaptive layer. (b) The text encoder processes the query using pretrained BERT. (c) The fusion module fuses primitive and text embeddings through self-attention and cross-attention, followed by a classifier that predicts target primitives.
  • Figure 4: Qualitative comparison of text-to-CAD generation. Left: Results on the advanced subset, which includes models with advanced operations. Right: Results on the standard subset. Our method produces more accurate CAD models aligned with input descriptions.
  • Figure 5: Feature length distribution comparison between DeepCAD and our FutureCAD dataset.
  • ...and 2 more figures
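
The Figure 2 caption mentions Chamfer Distance-based rewards for the RL stage. As a rough illustration only, the sketch below computes a symmetric Chamfer Distance between two small point sets and maps it to a reward in (0, 1]. The exponential mapping, the `scale` parameter, the zero reward for a failed program, and the function names `chamfer_distance` and `cd_reward` are assumptions made for this example; the summary above does not specify the paper's actual reward formulation or how points are sampled from the generated and reference models.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def _sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer Distance (mean squared nearest-neighbor distance,
    summed over both directions). Brute force, O(len(a) * len(b))."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    a_to_b = sum(min(_sq_dist(p, q) for q in b) for p in a) / len(a)
    b_to_a = sum(min(_sq_dist(q, p) for p in a) for q in b) / len(b)
    return a_to_b + b_to_a


def cd_reward(pred: Optional[Sequence[Point]],
              target: Sequence[Point],
              scale: float = 1.0) -> float:
    """Hypothetical reward: 0.0 when the generated program produced no
    geometry (pred is None), otherwise exp(-scale * CD), so that a
    perfect match scores 1.0. This mapping is an assumption."""
    if pred is None:
        return 0.0
    return math.exp(-scale * chamfer_distance(pred, target))


if __name__ == "__main__":
    target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    shifted = [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0)]
    print(cd_reward(target, target))   # identical point sets
    print(cd_reward(shifted, target))  # slightly offset point sets
    print(cd_reward(None, target))     # program failed to execute
```

In practice the point sets would be sampled from the surfaces of the executed CAD models, and a nearest-neighbor index (for example a k-d tree) would replace the brute-force search for larger samples.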