Seek-CAD: A Self-refined Generative Modeling for 3D Parametric CAD Using Local Inference via DeepSeek

Xueyang Li, Jiahao Li, Yu Song, Yunzhong Lou, Xiangdong Zhou

TL;DR

Seek-CAD tackles the problem of generating high-fidelity 3D parametric CAD models without fine-tuning large LLMs by leveraging a locally deployable open-source reasoning model (DeepSeek-R1) in a training-free framework. It introduces a retrieval-augmented generation pipeline and a novel SSR (Sketch, Sketch-based feature, Refinements) design paradigm, augmented with step-wise visual feedback and Chain-of-Thought alignment via Gemini-2.0 to iteratively refine CAD code. A CapType-based reference mechanism enables precise refinement of complex geometry, and a 40k-sample SSR CAD dataset supports practical industrial modeling needs. Experimental results show Seek-CAD achieves high geometric fidelity (CD/HD), accurate target descriptions (IoGT, G-Score), and meaningful diversity, highlighting the practicality of open-source, cost-efficient AI-assisted design workflows.

Abstract

The advent of generative modeling for Computer-Aided Design (CAD) will significantly transform the design of industrial products. Recent research has extended into the realm of Large Language Models (LLMs). In contrast to fine-tuning methods, training-free approaches typically rely on advanced closed-source LLMs, offering greater flexibility and efficiency in developing AI agents that generate parametric CAD models. However, the substantial cost of top-tier closed-source LLMs and the limitations on deploying them locally pose challenges in practical applications. Seek-CAD is the first exploration of the locally deployed open-source reasoning LLM DeepSeek-R1 for parametric CAD model generation with a training-free methodology. It is also the first study to incorporate both visual and Chain-of-Thought (CoT) feedback within a self-refinement mechanism for generating CAD models. Specifically, the initially generated parametric CAD model is rendered into a sequence of step-wise perspective images, which a Vision Language Model (VLM) processes alongside the corresponding CoTs derived from DeepSeek-R1 to assess the generated CAD model. DeepSeek-R1 then uses this feedback to refine the model in the next round of generation. Moreover, we present a new 3D CAD model dataset structured around the SSR (Sketch, Sketch-based feature, and Refinements) triple design paradigm. The dataset covers a wide range of CAD commands, aligning well with industrial application requirements and making it suitable for LLM-based generation. Extensive experiments validate the effectiveness of Seek-CAD under various metrics.

Paper Structure

This paper contains 23 sections, 12 equations, 14 figures, 3 tables.

Figures (14)

  • Figure 1: Overview of the Seek-CAD framework. The pipeline is divided into two parts, "Initial CAD Code Generation" and "CAD Code Refinement", both of which embed the knowledge constraint described in Sec. \ref{LIP} to guide DeepSeek-R1 to generate CAD code following the SSR paradigm (Sec. \ref{SSR}). In the first part, a given query $T$ is enhanced by conducting RAG on a local CAD corpus consisting of 10,000 CAD models. The top-3 retrieved candidates are then concatenated with $T$ to prompt DeepSeek-R1 to generate an initial CAD code $I_0$. In the second part, $I_0$ goes through Step-wise Visual Feedback with CoT for iterative refinement. To achieve this, we first use a rendering script $R(*)$ to obtain step-wise images of $I_0$, which represent the intermediate and ultimate shapes of the object ($M_I$, $M_U$) simultaneously. $\oplus$ denotes the concatenation of SSR triplets, where each triplet, represented by $S_i$ (Sec. \ref{SSR}), is rendered along with all its preceding triplets to preserve the correlations between object entities (more details in Sec. \ref{SVF}). Next, the step-wise images are fed into Gemini-2.0 to assess their alignment with the CoT from DeepSeek-R1. This feedback determines whether the current code $I_k$ is reasonable. In practice, we set $k=2$ as the maximum number of code refinement iterations.
  • Figure 2: The SSR Design Paradigm. Each CAD model is constructed as a sequence of SSR triplets, where each triplet consists of a sketch, a sketch-based feature (e.g., extrude, revolve), and optional refinement features (e.g., shell, chamfer, fillet). Topological primitives are traced using the CapType reference system (START, SWEPT, END) during modeling operations. Final shapes are formed by applying boolean operations (e.g., Union, Cut, Intersect) between the outputs of SSR triplets.
  • Figure 3: Illustration of the proposed CapType reference mechanism.
  • Figure 4: (a) Visual comparison of CAD generation results. (b) Visualization of the refinement capability of the SVF strategy (recall Sec. \ref{SVF}). Please enlarge to 225% to see the text clearly.
  • Figure 5: Various Showcases by Seek-CAD. Please enlarge to 180% to see the text clearly.
  • ...and 9 more figures
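
The control flow described in the Figure 1 and Figure 2 captions can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `SSRTriplet` fields, the function names, and the `retrieve`, `generate`, `render`, and `assess` callables are all hypothetical stand-ins for the RAG step, DeepSeek-R1, the rendering script $R(*)$, and the Gemini-2.0 check. Only the top-3 retrieval, the prefix-wise rendering of triplets, and the cap of two refinement rounds come from the captions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class SSRTriplet:
    """One SSR step (field names are assumptions made for this sketch)."""
    sketch: str                                            # 2D profile description
    feature: str                                           # e.g. "extrude", "revolve"
    refinements: List[str] = field(default_factory=list)   # e.g. ["fillet"]
    boolean: str = "Union"                                 # how this step joins the body


def stepwise_prefixes(triplets: Sequence) -> List[list]:
    """Return [S_1], [S_1, S_2], ... so each step is rendered with its predecessors."""
    return [list(triplets[: i + 1]) for i in range(len(triplets))]


def refine_loop(
    query: str,
    retrieve: Callable[[str, int], list],
    generate: Callable[[str, list, Optional[str]], Tuple[list, str]],
    render: Callable[[list], object],
    assess: Callable[[list, str], Tuple[bool, str]],
    top_k: int = 3,
    max_refinements: int = 2,
) -> list:
    """Generate an initial model, then refine it using step-wise visual feedback."""
    examples = retrieve(query, top_k)               # RAG over the local CAD corpus
    model, cot = generate(query, examples, None)    # initial code I_0 and its CoT
    for _ in range(max_refinements):
        images = [render(prefix) for prefix in stepwise_prefixes(model)]
        ok, feedback = assess(images, cot)          # VLM checks images against the CoT
        if ok:
            break
        model, cot = generate(query, examples, feedback)  # next round I_k
    return model
```

Passing the model calls in as plain callables keeps the loop testable with stubs and independent of any particular serving stack; in a real setup each callable would wrap a local model endpoint or a renderer.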