DeepOHeat-v1: Efficient Operator Learning for Fast and Trustworthy Thermal Simulation and Optimization in 3D-IC Design

Xinling Yu, Ziyue Liu, Hai Li, Yixing Li, Xin Ai, Zhiyu Zeng, Ian Young, Zheng Zhang

TL;DR

DeepOHeat-v1 advances operator learning for 3D-IC thermal analysis in three ways: (1) it replaces fixed-activation trunk networks with Kolmogorov–Arnold Networks to capture multi-scale thermal patterns, (2) it introduces separable training, which cuts physics-informed training cost (62x faster training and 31x less GPU memory in the baseline case), and (3) it adds a confidence-based hybrid optimization that couples fast neural predictions with GMRES refinement for reliability. The approach reaches accuracy comparable to high-fidelity finite-difference solvers while speeding up optimization workflows by up to 70.6x in floorplanning, and it enables high-resolution analyses that were previously out of reach because of GPU memory limits. The combination of adaptive basis learning, scalable training, and trustworthiness assessment makes DeepOHeat-v1 a practical tool for thermal-aware design exploration in complex 3D-IC architectures. Open-source code is provided to facilitate adoption and further development.

Abstract

Thermal analysis is crucial in 3D-IC design due to increased power density and complex heat dissipation paths. Although operator learning frameworks such as DeepOHeat~\cite{liu2023deepoheat} have demonstrated promising preliminary results in accelerating thermal simulation, they face critical limitations in prediction capability for multi-scale thermal patterns, training efficiency, and trustworthiness of results during design optimization. This paper presents DeepOHeat-v1, an enhanced physics-informed operator learning framework that addresses these challenges through three key innovations. First, we integrate Kolmogorov-Arnold Networks with learnable activation functions as trunk networks, enabling an adaptive representation of multi-scale thermal patterns. This approach achieves error reductions of 1.25x and 6.29x in two representative test cases. Second, we introduce a separable training method that decomposes the basis function along the coordinate axes, achieving a 62x training speedup and a 31x GPU memory reduction in our baseline case, and enabling thermal analysis at resolutions previously infeasible due to GPU memory constraints. Third, we propose a confidence score to evaluate the trustworthiness of the predicted results, and further develop a hybrid optimization workflow that combines operator learning with a finite difference (FD) solver using the Generalized Minimal Residual (GMRES) method for incremental solution refinement, enabling efficient and trustworthy thermal optimization. Experimental results demonstrate that DeepOHeat-v1 achieves accuracy comparable to optimization using high-fidelity finite difference solvers, while speeding up the entire optimization process by 70.6x in our test cases, effectively minimizing the peak temperature through optimal placement of heat-generating components. Open-source code is available at https://github.com/xlyu0127/DeepOHeat-v1.
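
The confidence check and GMRES refinement described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the 1D finite-difference conduction problem, the use of a relative residual as the confidence measure, the threshold value, and the names build_fd_system, relative_residual, and trusted_temperature are all assumptions made for illustration. The "surrogate prediction" is a hand-written temperature profile standing in for a neural operator output.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres


def build_fd_system(n):
    """Assumed toy problem: -T'' = 1 on (0, 1) with T = 0 at both ends.

    Returns the FD matrix A, right-hand side b, and interior grid points x.
    """
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    A = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr") / h**2
    b = np.ones(n)  # uniform power density (assumption)
    return A, b, x


def relative_residual(A, b, T):
    """Stand-in confidence measure: ||b - A T|| / ||b|| (smaller is better)."""
    return np.linalg.norm(b - A @ T) / np.linalg.norm(b)


def trusted_temperature(A, b, T_pred, threshold=1e-3):
    """Accept T_pred if its residual is small, otherwise refine it with GMRES."""
    res = relative_residual(A, b, T_pred)
    if res <= threshold:
        return T_pred, res, False
    # Warm start from the prediction so GMRES only has to solve for the correction.
    # SciPy's default relative tolerance (1e-5) is tighter than the threshold here.
    T_ref, info = gmres(A, b, x0=T_pred, restart=A.shape[0], maxiter=20, atol=0.0)
    if info != 0:
        raise RuntimeError(f"GMRES did not converge (info={info})")
    return T_ref, res, True


A, b, x = build_fd_system(40)
exact_profile = 0.5 * x * (1.0 - x)  # satisfies the FD equations of this toy problem
for name, T_pred in [("accurate", exact_profile), ("10% too low", 0.9 * exact_profile)]:
    T, res, refined = trusted_temperature(A, b, T_pred)
    print(f"{name}: residual before={res:.2e}, refined={refined}, peak T={T.max():.4f}")
```

In this sketch an accurate prediction passes the residual check and is returned untouched, while a mis-scaled one is handed to GMRES as an initial guess. An outer optimizer would call trusted_temperature once per candidate design.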

Paper Structure

This paper contains 34 sections, 36 equations, 15 figures, 3 tables, 1 algorithm.

Figures (15)

  • Figure 1: Operator learning framework for thermal simulation. The theoretical operator $G$ provides a direct mapping from design configurations to temperature fields and is fundamentally defined by the PDE system. The objective of operator learning is to train a surrogate neural operator $G_{\boldsymbol{\theta}}$ that approximates $G$, enabling rapid temperature prediction without solving PDEs for each new design configuration.
  • Figure 2: DeepOHeat for thermal simulation. This example shows the case where $k=1$, with the surface power map as the design configuration. The architecture consists of branch and trunk networks that process the encoded power map $\mathcal{E}(\boldsymbol{u})$ and spatial coordinates respectively. Their outputs are combined through Hadamard product and summation to predict temperature, while physics-informed training with reverse-mode AD enables parameter updates without simulation data.
  • Figure 3: Comparison between (a) a traditional multilayer perceptron (MLP) and (b) a Kolmogorov–Arnold Network (KAN). In MLPs, the activation functions are fixed at the nodes, while weights are learnable. In contrast, KANs use learnable activation functions on edges and sum operations at nodes, offering greater flexibility in function representation.
  • Figure 4: Separable representation for thermal prediction in 3D-ICs. The model processes sets of points $\mathbf{y}_1^{(:)}$, $\mathbf{y}_2^{(:)}$, $\mathbf{y}_3^{(:)}$ along each spatial dimension through three independent trunk networks to compute basis function values $\tau_{n,k}^{(:)}$. The branch network generates coefficients $\beta_k$ that weight these basis functions, which are combined through outer products to efficiently represent the complete 3D temperature field as a sum of rank-1 tensors.
  • Figure 5: Hybrid thermal optimization framework for 3D-IC design. The framework consists of two nested loops: (1) an outer optimizer loop (left) that explores the design space using any optimization algorithm of choice; and (2) an inner trustworthy prediction loop (right) that ensures reliable thermal evaluation by applying DeepOHeat-v1 and adaptively refining predictions with GMRES when the residual exceeds a predefined threshold.
  • ...and 10 more figures
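
The separable representation in the Figure 4 caption, a sum of rank-1 tensors built from per-axis basis values and branch coefficients, can be sketched in NumPy. This is an illustrative sketch under stated assumptions, not the released code: the tiny random tanh networks stand in for the three trunk networks, a random vector stands in for the branch output $\beta_k$, nothing is trained, and the names make_trunk, separable_field, and pointwise_field are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
R = 8  # number of basis functions (assumed value)


def make_trunk(rank, hidden=16):
    """Hypothetical per-axis trunk: a small random tanh network, 1D coordinate -> rank values."""
    W1 = rng.normal(size=(1, hidden))
    b1 = rng.normal(size=hidden)
    W2 = rng.normal(size=(hidden, rank))
    return lambda y: np.tanh(y[:, None] @ W1 + b1) @ W2  # shape (len(y), rank)


trunks = [make_trunk(R) for _ in range(3)]  # one independent trunk per spatial axis


def separable_field(beta, y1, y2, y3):
    """T[i, j, k] = sum_r beta[r] * t1[i, r] * t2[j, r] * t3[k, r]."""
    t1, t2, t3 = (f(y) for f, y in zip(trunks, (y1, y2, y3)))
    return np.einsum("r,ir,jr,kr->ijk", beta, t1, t2, t3)


def pointwise_field(beta, y1, y2, y3):
    """Reference: evaluate the same model one grid point at a time."""
    T = np.empty((len(y1), len(y2), len(y3)))
    for i, a in enumerate(y1):
        for j, b in enumerate(y2):
            for k, c in enumerate(y3):
                basis = (trunks[0](np.array([a]))
                         * trunks[1](np.array([b]))
                         * trunks[2](np.array([c])))  # shape (1, R)
                T[i, j, k] = float(basis[0] @ beta)
    return T


beta = rng.normal(size=R)  # stand-in for the branch network output
y1, y2, y3 = np.linspace(0, 1, 6), np.linspace(0, 1, 5), np.linspace(0, 1, 4)
T_sep = separable_field(beta, y1, y2, y3)
T_ref = pointwise_field(beta, y1, y2, y3)
print(T_sep.shape, np.allclose(T_sep, T_ref))
```

On this 6 x 5 x 4 grid the separable form evaluates the trunk stand-ins at 6 + 5 + 4 = 15 coordinates rather than at 120 grid points, which hints at where the training-time and memory savings of a separable formulation can come from.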