InCoder-32B-Thinking: Industrial Code World Model for Thinking

Jian Yang, Wei Zhang, Jiajun Wu, Junhang Cheng, Tuney Zheng, Fanglin Xu, Weicheng Gu, Lin Jing, Yaxin Du, Joseph Li, Yizhi Li, Yan Xing, Chuan Hao, Ran Tao, Ruihao Gong, Aishan Liu, Zhoujun Li, Mingjie Tang, Chenghua Lin, Siheng Chen, Wayne Xin Zhao, Xianglong Liu, Ming Zhou, Bryan Dai, Weifeng Lv

Abstract

Industrial software development across chip design, GPU optimization, and embedded systems lacks expert reasoning traces showing how engineers reason about hardware constraints and timing semantics. In this work, we propose InCoder-32B-Thinking, trained on data from the Error-driven Chain-of-Thought (ECoT) synthesis framework, which uses an industrial code world model (ICWM) to generate reasoning traces. Specifically, ECoT generates reasoning chains by synthesizing the thinking content from multi-turn dialogue with environmental error feedback, explicitly modeling the error-correction process. ICWM is trained on domain-specific execution traces from Verilog simulation, GPU profiling, etc.; it learns the causal dynamics of how code affects hardware behavior and enables self-verification by predicting execution outcomes before actual compilation. All synthesized reasoning traces are validated through domain toolchains, creating training data that matches the natural reasoning depth distribution of industrial tasks. Evaluation on 14 general benchmarks (81.3% on LiveCodeBench v5) and 9 industrial benchmarks (84.0% on CAD-Coder and 38.0% on KernelBench) shows that InCoder-32B-Thinking achieves top-tier open-source results across all domains.

Paper Structure

This paper contains 51 sections, 3 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Overview of InCoder-32B-Thinking, a coder for general and industrial code intelligence with thinking capability, which supports tasks from general coding to industrial tasks such as GPU optimization, chip design, and 3D modeling.
  • Figure 2: The performance of InCoder-32B-Thinking on code benchmarks.
  • Figure 3: Comparison of CUDA kernel implementations for Hinge Loss. Through reasoning, InCoder-32B-Thinking identifies the shape mismatch between the 2D predictions tensor and the 1D targets tensor, correctly mapping flat indices to row indices for proper broadcasting. The non-thinking model indexes both tensors with the same flat index, causing an out-of-bounds error.
  • Figure 4: Overview of the data engine pipeline. Left: task seeds and environment bundles are passed through an elicitation-execution-feedback loop against real backends, producing multi-turn trajectories $\mathcal{D}_{\text{real}}$. Right: $\mathcal{D}_{\text{real}}$ trains an ICWM that simulates the real backends during large-scale synthesis; periodic audits keep the world model calibrated across iterations.
  • Figure 5: ICWM fidelity across five industrial domains. Outcome prediction accuracy measures per-turn label agreement; trajectory agreement measures end-to-end verdict consistency over multi-turn correction sequences. The dashed line marks the 95% threshold. All domains exceed 95% on outcome accuracy; the 1.6--2.8 pp gap to trajectory agreement reflects error compounding across turns.
  • ...and 2 more figures
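
The indexing issue described in the Figure 3 caption can be illustrated with a small CPU-side sketch. This is not the kernel from the paper: it is a plain-Python stand-in that assumes a row-major `predictions` buffer of shape `(n_rows, n_cols)` and one target per row, and the function names `hinge_loss_flat` and `hinge_loss_naive` are hypothetical. Each loop iteration plays the role of one GPU thread handling one flat index.

```python
def hinge_loss_flat(predictions, targets, n_rows, n_cols):
    """Mean hinge loss over a row-major flattened (n_rows x n_cols) buffer.

    Assumption (not from the paper): targets holds one label per row.
    """
    total = 0.0
    for flat_idx in range(n_rows * n_cols):  # one iteration per simulated thread
        row = flat_idx // n_cols             # map the flat index back to its row
        total += max(0.0, 1.0 - predictions[flat_idx] * targets[row])
    return total / (n_rows * n_cols)


def hinge_loss_naive(predictions, targets, n_rows, n_cols):
    """Buggy variant: indexes both buffers with the same flat index."""
    total = 0.0
    for flat_idx in range(n_rows * n_cols):
        # targets has only n_rows entries, so this fails once flat_idx >= n_rows
        total += max(0.0, 1.0 - predictions[flat_idx] * targets[flat_idx])
    return total / (n_rows * n_cols)


if __name__ == "__main__":
    preds = [0.5, 2.0, -1.0, 0.0]  # 2 x 2, row-major
    labels = [1.0, -1.0]           # one label per row
    print(hinge_loss_flat(preds, labels, 2, 2))
    try:
        hinge_loss_naive(preds, labels, 2, 2)
    except IndexError as exc:
        print("naive indexing failed:", exc)
```

In Python the naive variant raises `IndexError`; in an actual CUDA kernel the same access would be an out-of-bounds memory read with undefined behavior rather than a clean exception, which is why this kind of bug is hard to spot without reasoning about tensor shapes.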