GeoFusion-CAD: Structure-Aware Diffusion with Geometric State Space for Parametric 3D Design

Xiaolei Zhou, Chuangjie Fang, Jie Wu, Jingyi Yang, Boyi Lin, Jianwei Zheng

Abstract

Parametric Computer-Aided Design (CAD) is fundamental to modern 3D modeling, yet existing methods struggle to generate long command sequences, especially under complex geometric and topological dependencies. Transformer-based architectures dominate CAD sequence generation because they model dependencies well, but their quadratic attention cost and limited context windows hinder scaling to long programs. We propose GeoFusion-CAD, an end-to-end diffusion framework for scalable, structure-aware generation. It encodes CAD programs as hierarchical trees, jointly capturing geometry and topology within a state-space diffusion process. Specifically, a lightweight G-Mamba block models long-range structural dependencies through selective state transitions, enabling coherent generation across extended command sequences. To support long-sequence evaluation, we introduce DeepCAD-240, an extended benchmark whose command sequences range from 40 to 240 in length while preserving the sketch-extrusion semantics of the ABC dataset. Extensive experiments demonstrate that GeoFusion-CAD achieves superior performance on both short and long command ranges, maintaining high geometric fidelity and topological consistency where Transformer-based models degrade. Our approach sets new state-of-the-art scores for long-sequence parametric CAD generation, establishing a scalable foundation for next-generation CAD modeling systems. Code and datasets are available on GitHub.
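
To make the contrast with quadratic attention concrete, the following is a minimal sketch of a generic Mamba-style selective state-space recurrence, not the paper's G-Mamba block. It assumes a single input channel, a diagonal state matrix, and a simple discretisation; the function name `selective_scan` and the weights `w_delta`, `w_b`, and `w_c` are hypothetical and chosen only for illustration. The point is that the step size and the input/output projections depend on the current input, and that one pass over the sequence costs time linear in its length.

```python
import math


def selective_scan(xs, a, w_delta, w_b, w_c):
    """Single-channel selective state-space scan (illustrative only).

    xs      : list of floats, the input sequence
    a       : list of negative floats, diagonal of the state matrix A
    w_delta : float, hypothetical weight producing the step size
    w_b, w_c: lists of floats, hypothetical weights producing B_t and C_t
    """
    n = len(a)
    h = [0.0] * n          # hidden state, one entry per state dimension
    ys = []
    for x in xs:
        # Input-dependent ("selective") parameters.
        delta = math.log1p(math.exp(w_delta * x))   # softplus keeps the step positive
        b = [w * x for w in w_b]                    # B_t depends on the input
        c = [w * x for w in w_c]                    # C_t depends on the input
        # Discretised update: h_t = exp(delta * A) * h_{t-1} + delta * B_t * x_t
        h = [math.exp(delta * a[i]) * h[i] + delta * b[i] * x for i in range(n)]
        # Readout: y_t = C_t . h_t
        ys.append(sum(c[i] * h[i] for i in range(n)))
    return ys


# One pass over a 240-step sequence: work grows linearly with length.
ys = selective_scan([0.5] * 240, a=[-1.0, -0.5], w_delta=1.0,
                    w_b=[1.0, 0.5], w_c=[0.3, 0.2])
```

Because the state `h` has a fixed size, memory per step does not grow with the sequence, which is the property that makes this family of models attractive for long command sequences.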

Paper Structure (56 sections, 32 equations, 9 figures, 6 tables, 1 algorithm)

Figures (9)

  • Figure 1: CAD construction pipeline and hierarchical representation. Sequential Sketch and Extrusion operations form a 3D solid, which is encoded by GeoFusion-CAD into a hierarchical tree capturing geometry and topology.
  • Figure 2: Comparison between transformer-based CAD architecture and ours. (Left) The transformer-based method trains command and argument sequences separately. (Right) Our GeoFusion-CAD models the hierarchical CAD topology with G-Mamba blocks, mapping solid and hierarchical topology into a unified sequence.
  • Figure 3: Overview of GeoFusion-CAD. The framework consists of two embedding layers, a G-Mamba diffusion encoder, and a CAD decoder. Each parent node conditions its child node during the reverse diffusion process, enabling structured and coherent generation.
  • Figure 4: Architecture of G-Mamba Blocks. The block consists of a DWC module, a GSM-SSD layer, and an MLP, enabling geometry-conditioned multi-scale state transitions.
  • Figure 5: Visual results on different test ranges. Compared with other methods, GeoFusion-CAD produces coherent solids with fewer surface artifacts. Subtle abnormalities are circled in red.
  • ...and 4 more figures
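
The captions for Figures 1 to 3 describe encoding a sketch-and-extrude program as a hierarchical tree, mapping it into a unified sequence, and letting each parent node condition its child. The sketch below illustrates one plausible way to do the flattening step, assuming a pre-order traversal that records each node's parent index and depth. The `Node` class, the node kinds (`solid`, `sketch`, `loop`, `line`, `extrude`), and the parameter tuples are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # hypothetical node type, e.g. "sketch"
    params: tuple = ()             # hypothetical numeric arguments
    children: list = field(default_factory=list)


def flatten(root):
    """Pre-order traversal of the tree.

    Returns a list of (index, parent_index, depth, kind, params) tuples.
    The root has parent_index -1; every other entry points at an earlier
    position, so a model reading the sequence can look up the parent.
    """
    out = []
    stack = [(root, -1, 0)]
    while stack:
        node, parent, depth = stack.pop()
        idx = len(out)
        out.append((idx, parent, depth, node.kind, node.params))
        # Push children in reverse so the first child is visited first.
        for child in reversed(node.children):
            stack.append((child, idx, depth + 1))
    return out


# A triangular profile extruded into a solid.
solid = Node("solid", (), [
    Node("sketch", (), [
        Node("loop", (), [
            Node("line", (0.0, 0.0, 1.0, 0.0)),
            Node("line", (1.0, 0.0, 1.0, 1.0)),
            Node("line", (1.0, 1.0, 0.0, 0.0)),
        ]),
    ]),
    Node("extrude", (0.5,)),
])
seq = flatten(solid)
```

Storing the parent index alongside each token keeps the topology recoverable from the flat sequence, which is one simple way a child step could be conditioned on its parent.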