CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design

Prashant Govindarajan, Davide Baldelli, Jay Pathak, Quentin Fournier, Sarath Chandar

TL;DR

CADmium presents a text-to-text framework for CAD design by using GPT-4.1 to generate expert-level language annotations for a large CAD corpus and fine-tuning a code-language model (Qwen-2.5-Coder-14B) to produce JSON-based CAD histories from natural language prompts. The approach introduces novel geometric and topological metrics, including SD, DMCD, and EECM, to better capture 3D fidelity beyond traditional point-cloud or mesh-based scores. Experimental results show CADmium can automate CAD design with competitive or superior performance to Text2CAD across several tasks, scales, and datasets, while highlighting improved annotation fluency and robustness to text prompts. The work advances rapid, text-conditioned CAD generation and provides a valuable dataset, code, and models for the CAD and 3D design communities.

Abstract

Computer-aided design (CAD) is the digital construction of 2D and 3D objects, and is central to a wide range of engineering and manufacturing industries such as automotive and aviation. Despite its importance, CAD modeling remains largely a time-intensive, manual task. Recent works have attempted to automate this process with small transformer-based models and handcrafted CAD sequence representations. However, there has been little effort to leverage the potential of large language models (LLMs) for sequential CAD design. In this work, we introduce a new large-scale dataset of more than 170k CAD models annotated with high-quality, human-like descriptions generated with our pipeline based on GPT-4.1. Using this dataset, we fine-tune powerful code-LLMs to generate CAD sequences represented in a JSON-based format from natural language descriptions, demonstrating the viability and effectiveness of this approach for text-conditioned CAD generation. Because simple metrics often fail to reflect the quality of generated objects, we introduce geometric and topological metrics based on sphericity, mean curvature, and Euler characteristic to provide richer structural insights. Our experiments and ablation studies on both synthetic and human-annotated data demonstrate that CADmium can automate CAD design, drastically speeding up the design of new objects. The dataset, code, and fine-tuned models are available online.
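
To make two of the quantities named in the abstract concrete, the sketch below computes the textbook sphericity, Ψ = π^(1/3) (6V)^(2/3) / A, and the Euler characteristic, χ = V − E + F, of a closed triangle mesh. It is only an illustration of these standard definitions, not the paper's SD or EECM metrics, whose exact formulations are not given in this summary. The function names and the unit-cube test mesh are assumptions made for the example.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def mesh_area_volume(vertices, faces):
    """Surface area and enclosed volume of a closed, consistently oriented triangle mesh."""
    area = 0.0
    volume = 0.0
    for i, j, k in faces:
        a, b, c = vertices[i], vertices[j], vertices[k]
        normal = _cross(_sub(b, a), _sub(c, a))
        area += 0.5 * math.sqrt(_dot(normal, normal))
        # Signed volume of the tetrahedron spanned by the origin and the triangle.
        volume += _dot(a, _cross(b, c)) / 6.0
    return area, abs(volume)

def sphericity(vertices, faces):
    """Standard sphericity: 1.0 for a perfect sphere, smaller for other shapes."""
    area, volume = mesh_area_volume(vertices, faces)
    return math.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / area

def euler_characteristic(vertices, faces):
    """chi = V - E + F, counting each undirected edge once."""
    edges = {frozenset(e) for i, j, k in faces for e in ((i, j), (j, k), (k, i))}
    return len(vertices) - len(edges) + len(faces)

# Hypothetical test mesh: a unit cube split into 12 outward-facing triangles.
CUBE_VERTICES = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
                 (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
CUBE_FACES = [(0, 2, 1), (0, 3, 2), (4, 5, 6), (4, 6, 7),
              (0, 1, 5), (0, 5, 4), (3, 7, 6), (3, 6, 2),
              (0, 4, 7), (0, 7, 3), (1, 2, 6), (1, 6, 5)]

if __name__ == "__main__":
    print(sphericity(CUBE_VERTICES, CUBE_FACES))            # about 0.806 for a cube
    print(euler_characteristic(CUBE_VERTICES, CUBE_FACES))  # 2 for a genus-0 solid
```

Comparing such values between a generated object and its reference is one plausible way to obtain a discrepancy score. A single-component closed surface with one through-hole, for instance, has χ = 0 rather than 2, so a mismatch in χ can flag a missing or extra hole even when the overall shape looks similar.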

Paper Structure

This paper contains 19 sections, 4 equations, 23 figures, 8 tables.

Figures (23)

  • Figure 1: The CADmium pipeline for text-to-CAD generation. CADmium reformulates CAD generation as a purely text-to-text task. First, GPT-4.1 generates natural-sounding yet geometrically precise descriptions of 176,017 objects using their construction sequences in minimal JSON, and up to 10 multi-view images rendered with Blender. Then, the Qwen2.5-Coder LLM is fine-tuned with LoRA to translate these descriptions back into CAD sequences.
  • Figure 2: Comparative analysis of CADmium and Text2CAD expert-level annotations. (a) Vocabulary growth as a function of token count demonstrates that Text2CAD has a limited vocabulary compared to CADmium. (b) Human-likeness, clarity, visual faithfulness given image renders, and completeness against the minimal JSON, as measured by Gemma-3 12B (Gemma Team, 2025), indicate that CADmium descriptions tend to be more natural-sounding, albeit more challenging. (c-e) Distributions of word counts, unique words, and digit lengths within numerical expressions per annotation show that CADmium descriptions are more concise and diverse.
  • Figure 3: Examples of generated objects from the CADmium test set: 3D models are generated by both Qwen2.5 Coder (fine-tuned) and Text2CAD (trained) on the CADmium training set. The examples are shown using CADmium’s expert prompts, with the natural-language descriptions shortened for readability. We present both successful and failed reconstructions from Qwen2.5 Coder and compare them against the outputs of Text2CAD.
  • Figure 4: Annotation comparison: CADmium (light purple) uses natural language, while Text2CAD (light blue) references JSON-like keys (e.g., 'face_1').
  • Figure 5: Geometric detail capture: CADmium (light purple) correctly identifies four holes, unlike Text2CAD (light blue) which describes two, highlighting improved visual understanding.
  • ...and 18 more figures
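
The vocabulary-growth comparison in Figure 2(a) can be approximated with a few lines of code. The sketch below is a minimal illustration that assumes lowercase whitespace tokenization; the tokenizer actually used for the figure is not stated in this summary, and the function name and `step` parameter are hypothetical.

```python
def vocabulary_growth(texts, step=1000):
    """Return (tokens_seen, unique_tokens) pairs sampled every `step` tokens.

    Assumes lowercase whitespace tokenization, which is a simplification.
    """
    seen = set()
    curve = []
    n_tokens = 0
    for text in texts:
        for token in text.lower().split():
            n_tokens += 1
            seen.add(token)
            if n_tokens % step == 0:
                curve.append((n_tokens, len(seen)))
    # Always record the final point so the curve covers the whole corpus.
    if not curve or curve[-1][0] != n_tokens:
        curve.append((n_tokens, len(seen)))
    return curve

if __name__ == "__main__":
    # Toy annotations, invented for the example.
    annotations = ["A flat plate with four holes", "A plate with two holes"]
    for tokens_seen, unique_tokens in vocabulary_growth(annotations, step=4):
        print(tokens_seen, unique_tokens)
```

A curve that flattens early indicates a corpus that keeps reusing the same words, which is the pattern the caption attributes to Text2CAD relative to CADmium.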