Design-Specification Tiling for ICL-based CAD Code Generation

Yali Du, San-Zhuo Xi, Hui Sun, Ming Li

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities in code generation, yet they underperform on domain-specific tasks such as Computer-Aided Design (CAD) code generation due to scarce training data. In-Context Learning (ICL) offers a training-free alternative through task-specific exemplars. However, existing selection strategies prioritize similarity or point-wise diversity, often producing redundant selections that fail to satisfy the compositional requirements of complex CAD design specifications. In this work, we propose knowledge sufficiency as a principled objective for exemplar selection that aims to maximally satisfy all requirements within design specifications. To realize this objective, we introduce Design-Specification Tiling (DST), which quantifies knowledge sufficiency through a surrogate tiling ratio by extracting multi-granular design components and measuring the proportion of query components covered by selected exemplars. We demonstrate that maximizing this objective constitutes submodular maximization and provide a polynomial-time greedy algorithm with a (1-1/e)-approximation guarantee. Extensive experiments demonstrate that DST substantially improves CAD code generation quality, consistently outperforming existing exemplar selection strategies in ICL.

Paper Structure (39 sections, 2 theorems, 36 equations, 10 figures, 3 tables, 1 algorithm)

Key Result

Proposition 1

The tiling objective $f(S) = |\mathcal{C}(S) \cap \mathcal{C}_{\text{query}}|$ satisfies the following properties:

Figures (10)

  • Figure 1: Compositional complexity in CAD design-specifications. The query (right) contains multiple design components (colored highlights), each requiring specific knowledge. Exemplars (left) each cover different component subsets. Effective selection should ensure collective coverage across all components.
  • Figure 2: Overview of the Design-Specification Tiling (DST) framework. Design specification components are extracted from the query and database using multi-granular sliding windows (left). The DST retrieval algorithm selects $k$ exemplars that maximize coverage of query components through greedy optimization (center). Selected exemplars are formatted into an ICL prompt template to guide the large language model in generating CAD code, which is then compiled and evaluated (right).
  • Figure 3: Performance trends of ICL strategies with increasing shot numbers based on DeepSeek-V3 on Hard tasks (Pareto front analysis)
  • Figure 4: Correlation between Tiling Ratio, Shot Count, and Performance on DeepSeek-V3 on Hard tasks across different ICL strategies
  • Figure 5: Exemplar Selection Quality Comparison Between DST and LDSIM for CAD Code Generation
  • ...and 5 more figures
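The retrieval step described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes components are whitespace-token n-grams produced by sliding windows, and the names `extract_components`, `greedy_tiling`, and `window_sizes` are hypothetical. The objective it maximizes is the tiling count $|\mathcal{C}(S) \cap \mathcal{C}_{\text{query}}|$, reported at the end as a ratio over the query components.

```python
from typing import List, Set, Tuple

Component = Tuple[str, ...]


def extract_components(text: str, window_sizes=(1, 2, 3)) -> Set[Component]:
    """Collect token n-grams with sliding windows of several sizes.

    Assumption: a 'design component' is approximated by a lowercase
    whitespace-token n-gram; the paper's extractor may differ.
    """
    tokens = text.lower().split()
    components: Set[Component] = set()
    for n in window_sizes:
        for i in range(len(tokens) - n + 1):
            components.add(tuple(tokens[i:i + n]))
    return components


def greedy_tiling(query: str, database: List[str], k: int,
                  window_sizes=(1, 2, 3)) -> Tuple[List[int], float]:
    """Greedily pick up to k exemplars that cover the most query components.

    Returns the selected database indices (in selection order) and the
    resulting tiling ratio, i.e. covered query components / all query components.
    """
    query_comps = extract_components(query, window_sizes)
    # Only components that also occur in the query can contribute coverage.
    db_comps = [extract_components(d, window_sizes) & query_comps
                for d in database]

    selected: List[int] = []
    covered: Set[Component] = set()
    for _ in range(min(k, len(database))):
        best_idx, best_gain = None, 0
        for i, comps in enumerate(db_comps):
            if i in selected:
                continue
            gain = len(comps - covered)  # marginal gain of adding exemplar i
            if gain > best_gain:
                best_idx, best_gain = i, gain
        if best_idx is None:  # no remaining exemplar adds new coverage
            break
        selected.append(best_idx)
        covered |= db_comps[best_idx]

    ratio = len(covered) / len(query_comps) if query_comps else 0.0
    return selected, ratio
```

Because each step scores an exemplar by the components it adds beyond those already covered, a second exemplar that merely repeats the first one's content receives zero gain, which is the redundancy the abstract attributes to similarity-based selection.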

Theorems & Definitions (4)

  • Definition 1: Submodularity [nemhauser1978best]
  • Proposition 1
  • Theorem 1: $(1-1/e)$-Approximation [nemhauser1978best]
  • proof
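For orientation, the standard textbook forms of the two cited statements are sketched below. The symbols $\mathcal{D}$ (exemplar database), $k$ (shot budget), $S_{\text{greedy}}$, and $S^{*}$ are notation assumed here for illustration and may not match the paper's own wording of Definition 1 and Theorem 1.

```latex
% Submodularity (diminishing returns): for all A \subseteq B \subseteq \mathcal{D}
% and every exemplar x \in \mathcal{D} \setminus B,
\[
  f(A \cup \{x\}) - f(A) \;\ge\; f(B \cup \{x\}) - f(B).
\]

% Greedy guarantee: if f is monotone, submodular, and f(\emptyset) = 0,
% then the greedy selection S_greedy of size k satisfies
\[
  f(S_{\text{greedy}}) \;\ge\; \left(1 - \frac{1}{e}\right) f(S^{*}),
  \qquad
  S^{*} \in \arg\max_{|S| \le k} f(S).
\]
```

Read against the tiling objective $f(S) = |\mathcal{C}(S) \cap \mathcal{C}_{\text{query}}|$, the first inequality says that an exemplar can only add fewer new query components as the selected set grows.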