Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection

Dacheng Qi, Chenyu Wang, Jingwei Xu, Tianzhe Chu, Zibo Zhao, Wen Liu, Wenrui Ding, Yi Ma, Shenghua Gao

TL;DR

Pointer-CAD is a novel LLM-based CAD generation framework that uses a pointer-based command sequence representation to explicitly incorporate the geometric information of B-rep models into sequential modeling, significantly mitigating the topological inaccuracies introduced by quantization error.

Abstract

Constructing computer-aided design (CAD) models is labor-intensive but essential for engineering and manufacturing. Recent advances in Large Language Models (LLMs) have inspired LLM-based CAD generation, which represents CAD models as command sequences. However, these methods struggle in practical scenarios because the command sequence representation does not support entity selection (e.g., of faces or edges), which limits its ability to express complex editing operations such as chamfer or fillet. Furthermore, discretizing continuous variables during sketch and extrude operations may result in topological errors. To address these limitations, we present Pointer-CAD, a novel LLM-based CAD generation framework that leverages a pointer-based command sequence representation to explicitly incorporate the geometric information of B-rep models into sequential modeling. In particular, Pointer-CAD decomposes CAD model generation into steps, conditioning the generation of each step on both the textual description and the B-rep produced by the previous steps. Whenever an operation requires the selection of a specific geometric entity, the LLM predicts a Pointer that selects the most feature-consistent candidate from the available set. This selection operation also reduces the quantization error of the command sequence-based representation. To support the training of Pointer-CAD, we develop a data annotation pipeline that produces expert-level natural language descriptions and apply it to build a dataset of approximately 575K CAD models. Extensive experimental results demonstrate that Pointer-CAD effectively supports the generation of complex geometric structures and reduces segmentation error to an extremely low level. This is a significant improvement over prior command sequence methods and substantially mitigates the topological inaccuracies introduced by quantization error.
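The pointer selection described in the abstract can be illustrated with a minimal sketch. It assumes that each candidate edge or face already has a feature vector and that "most feature-consistent" means highest cosine similarity to the predicted pointer vector. The abstract does not specify the scoring function, so `cosine`, `select_entity`, and the toy feature values are hypothetical and are not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_entity(pointer, candidates):
    """Return the id of the candidate whose feature best matches the pointer.

    `candidates` maps an entity id to its feature vector. Using cosine
    similarity as the matching score is an assumption made for illustration.
    """
    if not candidates:
        raise ValueError("no candidate entities to select from")
    return max(candidates, key=lambda eid: cosine(pointer, candidates[eid]))

# Hypothetical features for three edges of the current B-rep.
edge_features = {
    "edge_0": [1.0, 0.0, 0.0],
    "edge_1": [0.0, 1.0, 0.0],
    "edge_2": [0.0, 0.0, 1.0],
}

# Hypothetical pointer vector predicted by the LLM for a fillet operation.
pointer = [0.1, 0.9, 0.2]
selected = select_entity(pointer, edge_features)
print(selected)
```

Because the output is a choice among entities that already exist in the B-rep, the operation refers to exact geometry rather than to coordinates decoded from discrete tokens.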

Paper Structure (46 sections, 6 equations, 18 figures, 14 tables)


Figures (18)

  • Figure 1: Illustration of the strength of our proposed pointer-based command sequence compared to the previous command sequence-based CAD representation. Command sequences suffer from the inability to refer to specific edges or faces, and discretization-induced quantization errors. In contrast, Pointer-CAD leverages edge pointers to directly refer to B-rep entities, enabling precise operations such as sketch snapping, thereby reducing quantization errors and faithfully following complex text instructions.
  • Figure 2: Pointer-CAD Pipeline. At each generation step, the full user prompt is tokenized, while the B-rep is updated with all geometry generated so far. A multimodal fusion module combines the textual prompt with the evolving B-rep, which is further encoded via a graph neural network over its faces and edges. The fused representation is then processed by a large language model to predict the vector for the current step, which is subsequently translated into geometry to update the B-rep.
  • Figure 3: Dataset construction pipeline. Raw JSONs are converted into a minimal format containing only annotation-relevant elements. Sketch planes and models are rendered, and Qwen2.5-VL generates textual descriptions for integration into the JSON. Finally, Qwen2.5 produces step-by-step instructions, with dimension parameters wrapped in special tags for future data augmentation.
  • Figure 4: Qualitative performance comparison on Recap-DeepCAD dataset. Our method consistently produces accurate and faithful geometry aligned with the ground truth, while competing methods often miss details or collapse entirely. Notably, Pointer-CAD achieves superior results among LLM-based methods despite a significantly smaller size than CADmium.
  • Figure 5: Qualitative performance comparison on Recap-OmniCAD+ dataset. Our method accurately recovers detailed structures that closely match the ground truth for complex CAD models involving chamfer or fillet operations. Conversely, competing methods often miss fine-grained features or fail entirely.
  • ...and 13 more figures
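The quantization error and sketch snapping mentioned in the Figure 1 caption can be made concrete with a small sketch. It assumes uniform 8-bit quantization of coordinates over the range [-1, 1]. Both the bit width and the range are illustrative assumptions, and `quantize` and `snap_to_vertex` are hypothetical helpers, not the paper's method.

```python
def quantize(x, lo=-1.0, hi=1.0, bits=8):
    """Map x to the nearest of 2**bits uniformly spaced levels in [lo, hi]."""
    levels = 2 ** bits - 1
    idx = round((x - lo) / (hi - lo) * levels)
    idx = max(0, min(levels, idx))  # clamp out-of-range inputs
    return lo + idx * (hi - lo) / levels

def snap_to_vertex(referenced_vertex):
    """With a pointer, the sketch point reuses the referenced vertex exactly."""
    return tuple(referenced_vertex)

# An existing B-rep vertex that a new sketch line is supposed to touch.
vertex = (0.3337, -0.5)

# Plain command sequence: the coordinates pass through discrete tokens.
decoded = tuple(quantize(c) for c in vertex)
gap = max(abs(d - v) for d, v in zip(decoded, vertex))

# Pointer-based: the coordinates are copied from the selected entity.
snapped = snap_to_vertex(vertex)

print("gap after quantization:", gap)
print("gap after snapping:", max(abs(s - v) for s, v in zip(snapped, vertex)))
```

Under these assumptions the decoded point misses the vertex by a small but nonzero amount, which is enough to leave a sketch that should be closed slightly open. Copying the coordinates from the referenced entity removes the gap entirely.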