Image2CADSeq: Computer-Aided Design Sequence and Knowledge Inference from Product Images

Xingang Li, Zhenghui Sha

TL;DR

This paper introduces Image2CADSeq, a data-driven approach that predicts a sequence of CAD operations from a single image using a target-embedding variational autoencoder (TEVAE). It uses a Fusion 360 Gallery domain-specific language to define and vectorize CAD programs, and a two-stage training pipeline (Stage 1: CAD-sequence latent learning; Stage 2: image-to-latent regression) supported by a synthetic data generation pipeline. A multi-level evaluation framework assesses results at the level of CAD programs, 3D models, and images. It shows that combining design rules with the TEVAE yields the best performance, with notable improvements over the baseline autoencoder (AE), while parsing real-world images remains challenging. The work demonstrates potential for accelerating CAD reconstruction and for capturing and democratizing design knowledge, and it outlines routes to handling more complex geometries and real-world data.
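The two-stage idea named above (first learn a latent space for the targets, then regress from images into that latent space) can be illustrated with a deliberately simplified sketch. Everything here is an assumption made for illustration: the toy data, the dimensions, and the function names `encode_cad`, `decode_latent`, and `image_to_cad` are hypothetical, and a linear autoencoder (PCA) plus least-squares regression stands in for the paper's transformer-based VAE and image encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: each "CAD sequence" is a 12-dim vector and each
# "image" a 20-dim feature vector, both driven by 3 shared hidden factors.
n, seq_dim, img_dim, latent_dim = 200, 12, 20, 3
factors = rng.normal(size=(n, latent_dim))
cad_vectors = factors @ rng.normal(size=(latent_dim, seq_dim))
image_feats = factors @ rng.normal(size=(latent_dim, img_dim))
train, test = slice(0, 150), slice(150, n)

# Stage 1: learn a latent space from the CAD vectors alone.
# Linear stand-in (PCA via SVD) for the sequence autoencoder.
mean = cad_vectors[train].mean(axis=0)
_, _, vt = np.linalg.svd(cad_vectors[train] - mean, full_matrices=False)
basis = vt[:latent_dim]  # shape: (latent_dim, seq_dim)

def encode_cad(x):
    """Map CAD vectors to latent codes."""
    return (x - mean) @ basis.T

def decode_latent(z):
    """Map latent codes back to CAD vectors."""
    return z @ basis + mean

# Stage 2: keep the Stage 1 decoder fixed and regress from image
# features to the latent codes of the paired CAD vectors.
def with_bias(feats):
    return np.hstack([feats, np.ones((len(feats), 1))])

latent_targets = encode_cad(cad_vectors[train])
weights, *_ = np.linalg.lstsq(with_bias(image_feats[train]),
                              latent_targets, rcond=None)

def image_to_cad(feats):
    """Inference path: image features -> latent code -> CAD vector."""
    return decode_latent(with_bias(feats) @ weights)

# Held-out check on the noise-free toy data.
test_error = float(np.abs(image_to_cad(image_feats[test])
                          - cad_vectors[test]).max())
print(f"max abs error on held-out pairs: {test_error:.2e}")
```

The point of the split is that the decoder only ever sees CAD vectors in Stage 1, so Stage 2 reduces to a regression problem with the learned latent codes as targets. The toy data is exactly linear, which is why the held-out error is near zero here; that says nothing about the accuracy of the actual model.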

Abstract

Computer-aided design (CAD) tools empower designers to design and modify 3D models through a series of CAD operations, commonly referred to as a CAD sequence. In scenarios where digital CAD files are not accessible, reverse engineering (RE) has been used to reconstruct 3D CAD models. Recent advances have seen the rise of data-driven approaches for RE, with a primary focus on converting 3D data, such as point clouds, into 3D models in boundary representation (B-rep) format. However, obtaining 3D data poses significant challenges, and B-rep models do not reveal knowledge about the 3D modeling process of designs. To this end, our research introduces a novel data-driven approach with an Image2CADSeq neural network model. This model aims to reverse engineer CAD models by processing images as input and generating CAD sequences. These sequences can then be translated into B-rep models using a solid modeling kernel. Unlike B-rep models, CAD sequences offer enhanced flexibility to modify individual steps of model creation, providing a deeper understanding of the construction process of CAD models. To quantitatively and rigorously evaluate the predictive performance of the Image2CADSeq model, we have developed a multi-level evaluation framework for model assessment. The model was trained on a specially synthesized dataset, and various network architectures were explored to optimize the performance. The experimental and validation results show great potential for the model in generating CAD sequences from 2D image data.

Paper Structure

This paper contains 24 sections, 9 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Approach overview.
  • Figure 2: Image2CADSeq model using a target-embedding representation learning method.
  • Figure 3: Synthesis pipeline for the training dataset of paired images and vectorized CAD sequences, exemplified using a cylinder model.
  • Figure 4: Implementation of the Image2CADSeq model. Two different encoder-decoder architectures were explored in Stage 1: (1) Baseline model: a transformer-based autoencoder (AE), adapted from DeepCADNet \cite{wu2021deepcad}, and (2) Enhanced model: a transformer-based variational autoencoder (VAE), which extends the AE architecture. In Stage 2, the encoder is based on ResNet18 \cite{he2016deep} and employs a dropout layer before the final layer to mitigate overfitting and enhance generalization.
  • Figure 5: Evaluation of the Image2CADSeq model's performance using two distinct architectures with two datasets. (a) Case 1: Results from the network utilizing the TEA architecture trained on the dataset without rules. (b) Case 2: Results from the same TEA architecture but trained on the dataset with rules. (c) Case 3: Results using the TEVAE architecture trained on the dataset with rules. Each panel shows how the network's performance metrics (listed in Table \ref{tab:metrics}) vary with the first $n$ CAD operations in a CAD program. A tolerance of $\eta=3$ is used for metrics that involve the accuracy of parameters, such as ACP and AP$^1$.
  • ...and 5 more figures
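
The Figure 5 caption mentions parameter-accuracy metrics computed with a tolerance $\eta=3$ over the first $n$ CAD operations. The exact definitions of ACP and AP$^1$ are given in the paper's metrics table and are not reproduced here, so the following is only a minimal sketch of one plausible reading: a predicted parameter counts as correct when it lies within $\eta$ of the ground truth. The function name `parameter_accuracy`, the data layout (one list of integer parameters per operation), and the example values are all hypothetical.

```python
def parameter_accuracy(pred_ops, true_ops, n, eta=3):
    """Fraction of parameters in the first n operations with |pred - true| <= eta.

    Assumed layout: each program is a list of operations, and each
    operation is a list of integer (quantized) parameters.
    """
    pairs = [
        (p, t)
        for pred_op, true_op in zip(pred_ops[:n], true_ops[:n])
        for p, t in zip(pred_op, true_op)
    ]
    if not pairs:
        return 0.0
    hits = sum(abs(p - t) <= eta for p, t in pairs)
    return hits / len(pairs)


# Hypothetical two-operation program with made-up parameter values.
true_ops = [[128, 128, 40], [64]]
pred_ops = [[128, 126, 47], [64]]

acc_first_op = parameter_accuracy(pred_ops, true_ops, n=1)   # 2 of 3 within tolerance
acc_both_ops = parameter_accuracy(pred_ops, true_ops, n=2)   # 3 of 4 within tolerance
acc_exact = parameter_accuracy(pred_ops, true_ops, n=2, eta=0)
print(acc_first_op, acc_both_ops, acc_exact)
```

Setting `eta=0` reduces the check to exact matching, which makes the effect of the tolerance easy to see on the same example.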