NURBGen: High-Fidelity Text-to-CAD Generation through LLM-Driven NURBS Modeling
Muhammad Usama, Mohammad Sadil Khan, Didier Stricker, Muhammad Zeshan Afzal
TL;DR
NURBGen tackles the challenge of generating high-fidelity editable CAD models from natural language by casting text-to-CAD as a structured language-generation task. It fine-tunes an LLM to produce JSON-encoded NURBS surface parameters, enabling direct conversion to BRep geometry, and introduces partABC, a large-scale dataset of part-level CAD components with NURBS annotations and captions. A hybrid representation that combines untrimmed NURBS with analytic primitives addresses trimming artifacts while improving token efficiency. Experimental results show superior geometric fidelity, dimensional accuracy, and caption quality, highlighting the practical potential of NURBS-based text-to-CAD generation and providing a resource for future research.
Abstract
Generating editable 3D CAD models from natural language remains challenging, as existing text-to-CAD systems either produce meshes or rely on scarce design-history data. We present NURBGen, the first framework to generate high-fidelity 3D CAD models directly from text using Non-Uniform Rational B-Splines (NURBS). To achieve this, we fine-tune a large language model (LLM) to translate free-form text into JSON representations containing NURBS surface parameters (i.e., control points, knot vectors, degrees, and rational weights), which can be directly converted into BRep format using Python. We further propose a hybrid representation that combines untrimmed NURBS with analytic primitives to handle trimmed surfaces and degenerate regions more robustly, while reducing token complexity. Additionally, we introduce partABC, a curated subset of the ABC dataset consisting of individual CAD components, annotated with detailed captions using an automated annotation pipeline. NURBGen demonstrates strong performance on diverse prompts, surpassing prior methods in geometric fidelity and dimensional accuracy, as confirmed by expert evaluations. Code and dataset will be released publicly.
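To make the JSON-encoded NURBS idea concrete, the following is a minimal sketch of how one surface might be stored and evaluated. The schema and its field names (`degree_u`, `knots_u`, `control_points`, `weights`) are assumptions for illustration only, not the format used by NURBGen, and the conversion to BRep, which would need a CAD kernel, is not shown. The example surface is a quarter cylinder of radius 1 and height 2, evaluated with the standard Cox-de Boor recursion.

```python
import json
import math

# Hypothetical schema: the field names below are illustrative assumptions,
# not the JSON format defined by the paper.
SURFACE_JSON = """
{
  "degree_u": 2,
  "degree_v": 1,
  "knots_u": [0, 0, 0, 1, 1, 1],
  "knots_v": [0, 0, 1, 1],
  "control_points": [
    [[1, 0, 0], [1, 0, 2]],
    [[1, 1, 0], [1, 1, 2]],
    [[0, 1, 0], [0, 1, 2]]
  ],
  "weights": [
    [1.0, 1.0],
    [0.7071067811865476, 0.7071067811865476],
    [1.0, 1.0]
  ]
}
"""


def basis(i, p, t, knots):
    """Cox-de Boor recursion: i-th B-spline basis function of degree p at t."""
    if p == 0:
        # Spans are half-open, except the last non-empty span, which is
        # closed on the right so that t == knots[-1] can be evaluated.
        if knots[i] <= t < knots[i + 1]:
            return 1.0
        if t == knots[-1] and knots[i] < knots[i + 1] == knots[-1]:
            return 1.0
        return 0.0
    left = right = 0.0
    d1 = knots[i + p] - knots[i]
    if d1 > 0:  # skip terms with a zero-length knot span (0/0 convention)
        left = (t - knots[i]) / d1 * basis(i, p - 1, t, knots)
    d2 = knots[i + p + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + p + 1] - t) / d2 * basis(i + 1, p - 1, t, knots)
    return left + right


def evaluate(surface, u, v):
    """Return the 3D point on the rational surface at parameters (u, v)."""
    p, q = surface["degree_u"], surface["degree_v"]
    ku, kv = surface["knots_u"], surface["knots_v"]
    num = [0.0, 0.0, 0.0]
    den = 0.0
    for i, row in enumerate(surface["control_points"]):
        bu = basis(i, p, u, ku)
        for j, pt in enumerate(row):
            # Weighted tensor-product basis value for control point (i, j)
            w = bu * basis(j, q, v, kv) * surface["weights"][i][j]
            den += w
            for k in range(3):
                num[k] += w * pt[k]
    return [c / den for c in num]


surface = json.loads(SURFACE_JSON)
x, y, z = evaluate(surface, 0.5, 0.5)
print(math.hypot(x, y), z)  # distance from the cylinder axis, and height
```

The rational weights are what let the surface reproduce the circular arc exactly: with all weights set to 1, the same control points would give a parabolic approximation instead.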
