GaussianCAD: Robust Self-Supervised CAD Reconstruction from Three Orthographic Views Using 3D Gaussian Splatting

Zheng Zhou, Zhe Li, Bo Yu, Lina Hu, Liang Dong, Zijian Yang, Xiaoli Liu, Ning Xu, Ziwei Wang, Yonghao Dang, Jianqin Yin

TL;DR

GaussianCAD tackles 3D CAD reconstruction from three orthographic raster views by casting the problem as sparse-view reconstruction and employing 3D Gaussian Splatting in a self-supervised framework that eliminates dependence on 3D ground truth. The method integrates Sketch Augmentation, Camera Pose Localization, and Sparse-view CAD Reconstruction, leveraging Visual Hull initialization and a color/mask-based loss $\mathcal{L}_{gs}$ to optimize the 3D Gaussians. Experiments on Sub-Fusion360 show state-of-the-art accuracy and strong robustness to noise, outperforming baselines and demonstrating industrial relevance. The work highlights a practical path toward robust, automatic CAD reconstruction from simple raster sketches in real-world design pipelines.

Abstract

The automatic reconstruction of 3D computer-aided design (CAD) models from CAD sketches has recently gained significant attention in the computer vision community. Most existing methods, however, rely on vector CAD sketches and 3D ground truth for supervision, which are often difficult to obtain in industrial applications, and they are sensitive to noisy inputs. We propose viewing CAD reconstruction as a specific instance of sparse-view 3D reconstruction to overcome these limitations. While this reformulation offers a promising perspective, existing 3D reconstruction methods typically require natural images and corresponding camera poses as inputs, which introduces two major challenges: (1) modality discrepancy between CAD sketches and natural images, and (2) difficulty of accurate camera pose estimation for CAD sketches. To solve these issues, we first transform the CAD sketches into representations resembling natural images and extract corresponding masks. Next, we manually calculate the camera poses for the orthographic views to ensure accurate alignment within the 3D coordinate system. Finally, we employ a customized sparse-view 3D reconstruction method to achieve high-quality reconstructions from aligned orthographic views. By leveraging raster CAD sketches for self-supervision, our approach eliminates the reliance on vector CAD sketches and 3D ground truth. Experiments on the Sub-Fusion360 dataset demonstrate that our proposed method significantly outperforms previous approaches in CAD reconstruction performance and exhibits strong robustness to noisy inputs.
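The abstract notes that camera poses for the orthographic views are calculated directly rather than estimated. The following is a minimal sketch of what such a calculation could look like, not the authors' implementation: the view names, the world axes (z up), the viewing directions, and the helper names `view_rotation` and `project_orthographic` are all assumptions made for illustration.

```python
# Hypothetical axis conventions (not from the paper): world z points up,
# and each view is defined by a viewing direction plus an image "up" vector.
VIEWS = {
    "front": ((0, 1, 0), (0, 0, 1)),   # looking along +y
    "top":   ((0, 0, -1), (0, 1, 0)),  # looking straight down
    "side":  ((-1, 0, 0), (0, 0, 1)),  # looking along -x
}


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def view_rotation(name):
    """Rows of a world-to-camera rotation: (right, up, back)."""
    direction, up = VIEWS[name]
    right = cross(direction, up)
    back = tuple(-c for c in direction)  # camera looks along its -z axis
    return (right, up, back)


def project_orthographic(point, name):
    """Orthographic projection: keep the right/up coordinates, drop depth."""
    right, up, _ = view_rotation(name)
    return (dot(right, point), dot(up, point))


if __name__ == "__main__":
    p = (1, 2, 3)
    for name in VIEWS:
        print(name, project_orthographic(p, name))
```

Because the three views are axis-aligned, each projection simply drops one world coordinate, which is why the poses can be written down in closed form instead of being recovered by a pose-estimation method.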

Paper Structure

This paper contains 19 sections, 11 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: Reconstruction results obtained with GaussianCAD. The framework is capable of reconstructing high-quality 3D CAD models from raster CAD sketches containing three orthographic views based on 3D Gaussian Splatting. GaussianCAD demonstrates superior performance over previous methods across diverse CAD sketches.
  • Figure 2: Overview of GaussianCAD. During Sketch Augmentation, a raster CAD sketch containing three orthographic views is provided as input. For each view, we filter out the noise and remove dashed lines, then extract the mask and colorize the corresponding foreground, ultimately obtaining a natural image–like CAD sketch. In the Camera Pose Localization stage, we determine the camera poses for the three views based on their relative positional priors to achieve precise alignment in 3D space and set uniform intrinsic camera parameters. During Sparse-view CAD Reconstruction, we initialize the 3D Gaussians by constructing the visual hull using the masked views and corresponding camera parameters, which are optimized with $\mathcal{L}_{gs}$. Finally, we obtain a 3D CAD model that matches the input multi-view sketch.
  • Figure 3: Visualization highlighting the quality differences between raster and vector CAD sketches. The raster sketch (left) exhibits significantly more noise than the vector sketch (right).
  • Figure 4: Qualitative examples from the Sub-Fusion360 dataset. GaussianObject fails to reconstruct any 3D CAD model, producing sparse artifacts and fragmented clusters. Photo2CAD reconstructs some 3D models, but they do not match the input CAD sketches, resulting in inaccurate outcomes.
  • Figure 5: Robustness verification of our method. The results demonstrate that our approach remains robust under noisy input conditions, while the traditional method (OrthoRec) fails to reconstruct accurately even with minimal noise. Red circles highlight the missing faces.
  • ...and 5 more figures
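
The Figure 2 caption describes initializing the 3D Gaussians from a visual hull built from the masked views. A minimal sketch of that idea for three axis-aligned orthographic masks might look as follows. The mask indexing (`front[x][z]`, `top[x][y]`, `side[y][z]`), the shared resolution, the unit-cube normalization, and the function names are assumptions for illustration, not details taken from the paper.

```python
def visual_hull(front, top, side):
    """Carve an n x n x n voxel grid, indexed hull[x][y][z], from three masks.

    Assumed (hypothetical) conventions:
      front[x][z] : silhouette seen along the y axis
      top[x][y]   : silhouette seen along the z axis
      side[y][z]  : silhouette seen along the x axis
    """
    n = len(front)
    # A voxel is kept only if it projects inside all three silhouettes.
    return [
        [
            [bool(front[x][z] and top[x][y] and side[y][z]) for z in range(n)]
            for y in range(n)
        ]
        for x in range(n)
    ]


def occupied_centers(hull):
    """Centers of occupied voxels, scaled into the cube [-0.5, 0.5]^3.

    These points could serve as initial positions for the Gaussians.
    """
    n = len(hull)
    return [
        ((x + 0.5) / n - 0.5, (y + 0.5) / n - 0.5, (z + 0.5) / n - 0.5)
        for x in range(n)
        for y in range(n)
        for z in range(n)
        if hull[x][y][z]
    ]


if __name__ == "__main__":
    n = 4
    # Silhouettes of a small box: x in [1, 3), y in [0, 2), z in [2, 4).
    front = [[1 <= x < 3 and 2 <= z < 4 for z in range(n)] for x in range(n)]
    top = [[1 <= x < 3 and 0 <= y < 2 for y in range(n)] for x in range(n)]
    side = [[0 <= y < 2 and 2 <= z < 4 for z in range(n)] for y in range(n)]
    centers = occupied_centers(visual_hull(front, top, side))
    print(len(centers), "occupied voxels")
```

A visual hull is only an upper bound on the true shape: concavities that are invisible in all three silhouettes stay filled, which is why the caption describes it as an initialization that is then refined by optimizing $\mathcal{L}_{gs}$.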