PS-CAD: Local Geometry Guidance via Prompting and Selection for CAD Reconstruction

Bingchen Yang, Haiyong Jiang, Hao Pan, Peter Wonka, Jun Xiao, Guosheng Lin

TL;DR

This work tackles reverse engineering CAD models from point clouds by introducing PS-CAD, an iterative prompt-and-select framework that injects local geometric guidance into CAD sequence reconstruction. It leverages a differencing point cloud $p_\text{ref}$ and planar prompts from RANSAC to guide single-step sketches, extrusions, and Boolean operations, with a selection module to choose the most geometrically consistent candidate at each step. Using a CAD DSL based on SkexGen and a Point-MAE encoder, PS-CAD achieves substantial gains over state-of-the-art methods on DeepCAD and shows strong cross-domain performance on Fusion360, reducing geometry errors by about 10% and structural errors by about 15%. The approach enhances editability and robustness in reverse CAD reconstruction, though it currently focuses on sketch-extrusion workflows and leaves broader CAD operations for future extension.

Abstract

Reverse engineering CAD models from raw geometry is a classic but challenging research problem. In particular, reconstructing the CAD modeling sequence from point clouds provides great interpretability and convenience for editing. To make progress on this problem, we introduce geometric guidance into the reconstruction network. Our proposed model, PS-CAD, reconstructs the CAD modeling sequence one step at a time. At each step, we provide two forms of geometric guidance. First, we provide, as a point cloud, the geometry of the surfaces where the current reconstruction differs from the complete model. This helps the framework focus on regions that still need work. Second, we use geometric analysis to extract a set of planar prompts that correspond to candidate surfaces where a CAD extrusion step could be started. Our framework has three major components. Geometric guidance computation extracts the two types of geometric guidance. Single-step reconstruction computes a single candidate CAD modeling step for each provided prompt. Single-step selection selects among the candidate CAD modeling steps. The process continues until the reconstruction is completed. Our quantitative results show a significant improvement across all metrics. For example, on the DeepCAD dataset, PS-CAD improves upon the best published SOTA method by reducing the geometry errors (CD and HD) by 10%, and the structural error (ECD metric) by about 15%.
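
The step-by-step loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the one-sided nearest-neighbour test with threshold `tau`, and the callables `propose`, `select`, and `execute`, are hypothetical stand-ins for the guidance, reconstruction, selection, and CAD execution modules. The toy stand-ins at the bottom only exercise the control flow.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]
Step = List[Point]  # toy stand-in: a "modeling step" is the set of points it covers


def difference_points(p_full: List[Point], p_prev: List[Point], tau: float = 0.05) -> List[Point]:
    """Keep points of p_full that have no point of p_prev within distance tau.

    Assumption: a brute-force, one-sided difference; the paper's exact
    definition of the differencing point cloud may differ.
    """
    return [p for p in p_full if all(math.dist(p, q) > tau for q in p_prev)]


def reconstruct(
    p_full: List[Point],
    propose: Callable[[List[Point]], List[Step]],
    select: Callable[[List[Step], List[Point]], Step],
    execute: Callable[[List[Step]], List[Point]],
    max_steps: int = 10,
    tau: float = 0.05,
) -> List[Step]:
    """Iterate: compute guidance, propose candidates, select one, execute."""
    sequence: List[Step] = []
    p_prev: List[Point] = []  # nothing reconstructed yet
    for _ in range(max_steps):
        p_ref = difference_points(p_full, p_prev, tau)
        if not p_ref:  # stop criterion: nothing left to explain
            break
        candidates = propose(p_ref)  # one candidate per prompt
        if not candidates:
            break
        sequence.append(select(candidates, p_full))
        p_prev = execute(sequence)  # points covered by the steps so far
    return sequence


# Toy stand-ins (hypothetical): candidates cover the first one or two missing points.
def toy_propose(p_ref: List[Point]) -> List[Step]:
    return [p_ref[:1], p_ref[:2]]


def toy_select(candidates: List[Step], p_full: List[Point]) -> Step:
    return max(candidates, key=len)  # prefer the candidate covering more points


def toy_execute(sequence: List[Step]) -> List[Point]:
    return [p for step in sequence for p in step]


p_full = [(float(i), 0.0, 0.0) for i in range(5)]
steps = reconstruct(p_full, toy_propose, toy_select, toy_execute)
print(len(steps))
```

In a real system the brute-force distance test would likely be replaced by a spatial index, and `select` would score candidates against the input geometry rather than count points.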

Paper Structure (23 sections, 16 equations, 17 figures, 7 tables)

Figures (17)

  • Figure 1: Token probabilities for our method and HNC-CAD (Xu et al., 2023). Left: the input point cloud, the first CAD modeling step of the ground truth, and the final output of the ground truth. Right: probabilities of the ground-truth tokens for sketches and extrusion operations for HNC-CAD and our method. Our method performs stably for each token, while for HNC-CAD the probabilities decrease drastically after several tokens have been generated.
  • Figure 2: Comparison of different reconstruction pipelines: (a) feed-forward network (FFN), (b) autoregressive network (AR), and (c) our iterative reconstruction. In (a), the FFN decomposes an input point cloud into a fixed number of extrusion cylinders and their Boolean operations, each represented as a CAD modeling step $(o_t^0, l_t^0)$. In (b), the AR reconstructs the CAD modeling sequence token by token. By sampling different tokens, which carry no specific geometric meaning, the AR can reconstruct multiple CAD modeling sequences and output the one with the lowest reconstruction error. In (c), we propose an iterative pipeline in which each step consists of three components: geometric guidance computation, single-step reconstruction, and single-step selection. At step $t$, geometric guidance computation generates candidate prompts $\text{pr}_t^i$ (in the form of sampled planes) and local geometric information $p_\text{ref}$ to guide the single-step reconstruction module. The single-step reconstruction module produces one candidate CAD modeling step $(o_t^i, l_t^i)$ per prompt. Then, the single-step selection module selects the CAD modeling step with the highest score and appends the corresponding tokens to the output sequence. This process iterates for $n$ steps, until a stop criterion is satisfied.
  • Figure 3: Illustration for the sketch and extrude representation. a) We show a 2D sketch consisting of four loops (three inner loops and one outer loop). The four loops define a face shaded in blue. Each of the loops consists of one or multiple curves, where each curve is a linear segment (defined by two points), an arc (defined by three points), or a circle (defined by four points). The outer loop is an example of a circle. This example shows a single face, but in general, a sketch with multiple faces is also allowed. b) The 2D sketch is transformed from its local 2D coordinate system to the 3D coordinate system of the CAD model by translation, orientation, and scaling. c) The 2D sketch is extruded from a height $d^-$ to a height $d^+$ defining an extrusion cylinder. We also use the bounding box of the extrusion cylinder in our computation.
  • Figure 4: An illustration of $p_\text{ref}$ for the first step in Fig. \ref{fig:pip-overview}, 3 detected planes from RANSAC, and their corresponding extrusion cylinders $o_0^* = (s_0^*, e_0^*)$. Only 3 detected planes are shown for simplicity.
  • Figure 5: Illustration of geometric guidance computation and single-step reconstruction. Geometric guidance computation takes the point clouds $p_\text{full}$ and $p_\text{prev}$ as input. It outputs the difference between $p_\text{full}$ and $p_\text{prev}$ as $p_\text{ref}$, together with a list of prompts $\text{pr}_t^i$ detected from $p_\text{ref}$. The single-step reconstruction module consists of a sketch-extrude decoder and a Boolean operation decoder. The prompt $\text{pr}_t^i$, the point cloud $p_\text{ref}$, and a start token <SOS> are fed into the sketch-extrude decoder to autoregressively predict a sketch $s_t^i$ and an extrude operation $e_t^i$, which together define the extrusion cylinder $o_t^i=(s_t^i, e_t^i)$. The Boolean operation decoder predicts $l_t^i$ from the inputs $p_\text{full}$ and $p_\text{cur}^i$, where $p_\text{cur}^i$ is obtained by executing $o_t^i$. Finally, $l_t^i$ and $o_t^i$ are combined as the output of single-step reconstruction.
  • ...and 12 more figures
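
The sketch-and-extrude representation described in the Figure 3 caption can be illustrated with a small, self-contained example. It is a sketch under stated assumptions rather than the paper's code: the sketch plane is parameterised here by an origin and two orthonormal in-plane axes `u` and `v` (one possible way to encode translation and orientation), loops are reduced to polygon vertices, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Cross product, used to get the extrusion direction from the plane axes."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def sketch_to_3d(points_2d: Sequence[Tuple[float, float]],
                 origin: Vec3, u: Vec3, v: Vec3, scale: float) -> List[Vec3]:
    """Map local 2D sketch points to 3D: scale, orient along (u, v), translate."""
    return [tuple(origin[k] + scale * (x * u[k] + y * v[k]) for k in range(3))
            for x, y in points_2d]


def extrude(points_3d: Sequence[Vec3], normal: Vec3,
            d_minus: float, d_plus: float) -> List[Vec3]:
    """Offset the profile along the normal to heights d_minus and d_plus."""
    bottom = [tuple(p[k] + d_minus * normal[k] for k in range(3)) for p in points_3d]
    top = [tuple(p[k] + d_plus * normal[k] for k in range(3)) for p in points_3d]
    return bottom + top


def bounding_box(vertices: Sequence[Vec3]) -> Tuple[Vec3, Vec3]:
    """Axis-aligned bounding box of the extruded vertices."""
    lo = tuple(min(p[k] for p in vertices) for k in range(3))
    hi = tuple(max(p[k] for p in vertices) for k in range(3))
    return lo, hi


# A unit square sketch placed on the z = 0 plane and extruded from -0.5 to 1.5.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
u, v = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
normal = cross(u, v)
profile = sketch_to_3d(square, origin=(1.0, 2.0, 0.0), u=u, v=v, scale=2.0)
solid = extrude(profile, normal, d_minus=-0.5, d_plus=1.5)
box = bounding_box(solid)
print(box)
```

For loops that contain arcs or circles, the box computed from defining points alone would only be approximate; sampling the curves densely before calling `bounding_box` is one way around that.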