A fast cosine transformation accelerated method for predicting effective thermal conductivity

Changqing Ye, Shubin Fu, Eric T. Chung

TL;DR

This work addresses the computational bottleneck of predicting effective thermal conductivity (ETC) for heterogeneous materials, which requires solving a PDE on a high-resolution Representative Volume Element (RVE). It discretizes the PDE with the Two-Point Flux-Approximation (TPFA) scheme under mixed Dirichlet–Neumann boundary conditions and solves the resulting linear system with the Preconditioned Conjugate Gradient (PCG) method. The preconditioner originates from FFT-based homogenization, and its reference parameters are determined by a small linear programming problem. The preconditioner system is solved with multiple Fast Cosine Transforms (FCTs) and parallel tridiagonal solvers, using a memory-efficient FCT implementation derived from FFTs to suit the CUDA platform. Numerical experiments on 3D RVEs with up to $512^3$ degrees of freedom show a roughly 5-fold GPU speedup over a pure CPU platform and robust convergence across several heterogeneous configurations, although performance degrades at very high contrast ratios. The approach offers a practical, hardware-friendly route to rapid ETC prediction for materials design and thermal management.

Abstract

Predicting effective thermal conductivity by solving a Partial Differential Equation (PDE) defined on a high-resolution Representative Volume Element (RVE) is a computationally intensive task. In this paper, we tackle the task by proposing an efficient and implementation-friendly computational method that can fully leverage the computing power offered by hardware accelerators, namely, graphics processing units (GPUs). We first employ the Two-Point Flux-Approximation scheme to discretize the PDE and then utilize the preconditioned conjugate gradient method to solve the resulting algebraic linear system. The construction of the preconditioner originates from FFT-based homogenization methods, and an engineered linear programming technique is utilized to determine the homogeneous reference parameters. The fundamental observation presented in this paper is that the preconditioner system can be effectively solved using multiple Fast Cosine Transformations (FCT) and parallel tridiagonal matrix solvers. Because multiple FCTs are not available by default on the CUDA platform, we detail how to derive FCTs from FFTs with nearly optimal memory usage. Numerical experiments, including a stability comparison with standard preconditioners, are conducted for 3D RVEs. Our performance reports indicate that the proposed method can achieve a $5$-fold acceleration on the GPU platform over the pure CPU platform and solve problems with $512^3$ degrees of freedom and reasonable contrast ratios in less than $30$ seconds.
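
The pipeline described in the abstract can be illustrated with a small 2D sketch in NumPy. It is not the authors' implementation, and everything named in it is an assumption made for illustration: the function names `tpfa_apply`, `make_preconditioner` and `pcg`, the disc-shaped inclusion, and the single scalar `kappa_ref` (the paper determines its reference parameters by linear programming, which is not reproduced here). A dense cosine matrix stands in for the fast cosine transform, and a Thomas solve vectorised over frequencies stands in for the parallel tridiagonal solvers.

```python
import numpy as np

def tpfa_apply(kappa, u):
    """Apply the TPFA operator on a uniform 2D grid (axis 0 = x, axis 1 = z)."""
    out = np.zeros_like(u)
    # Interior x-faces: harmonic-mean transmissibility; the x-sides are no-flux.
    tx = 2 * kappa[:-1] * kappa[1:] / (kappa[:-1] + kappa[1:])
    fx = tx * (u[:-1] - u[1:])
    out[:-1] += fx
    out[1:] -= fx
    # Interior z-faces.
    tz = 2 * kappa[:, :-1] * kappa[:, 1:] / (kappa[:, :-1] + kappa[:, 1:])
    fz = tz * (u[:, :-1] - u[:, 1:])
    out[:, :-1] += fz
    out[:, 1:] -= fz
    # Dirichlet faces at z = 0 and z = 1: half-cell distance gives 2 * kappa.
    out[:, 0] += 2 * kappa[:, 0] * u[:, 0]
    out[:, -1] += 2 * kappa[:, -1] * u[:, -1]
    return out

def make_preconditioner(nx, nz, kappa_ref=1.0):
    """Return r -> A_ref^{-1} r for the constant-coefficient reference operator."""
    k = np.arange(nx)[:, None]
    j = np.arange(nx)[None, :]
    # Orthonormal DCT-II matrix (dense stand-in for a fast cosine transform).
    C = np.sqrt(2.0 / nx) * np.cos(np.pi * k * (2 * j + 1) / (2 * nx))
    C[0] /= np.sqrt(2.0)
    lam = 2.0 - 2.0 * np.cos(np.pi * np.arange(nx) / nx)  # Neumann eigenvalues in x
    # For each frequency: tridiagonal system in z with off-diagonals equal to -1.
    diag = np.full((nx, nz), 2.0) + lam[:, None]
    diag[:, 0] += 1.0
    diag[:, -1] += 1.0

    def apply(r):
        rhat = C @ r / kappa_ref
        c = np.zeros((nx, nz))
        d = np.zeros((nx, nz))
        c[:, 0] = -1.0 / diag[:, 0]
        d[:, 0] = rhat[:, 0] / diag[:, 0]
        for m in range(1, nz):  # forward sweep of the Thomas algorithm
            denom = diag[:, m] + c[:, m - 1]
            c[:, m] = -1.0 / denom
            d[:, m] = (rhat[:, m] + d[:, m - 1]) / denom
        uhat = np.empty_like(rhat)
        uhat[:, -1] = d[:, -1]
        for m in range(nz - 2, -1, -1):  # back substitution
            uhat[:, m] = d[:, m] - c[:, m] * uhat[:, m + 1]
        return C.T @ uhat

    return apply

def pcg(apply_A, apply_Minv, b, tol=1e-8, maxiter=500):
    """Preconditioned conjugate gradient; returns the solution and iteration count."""
    x = np.zeros_like(b)
    r = b.copy()
    z = apply_Minv(r)
    p = z.copy()
    rz = np.sum(r * z)
    bnorm = np.linalg.norm(b)
    for it in range(1, maxiter + 1):
        Ap = apply_A(p)
        alpha = rz / np.sum(p * Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * bnorm:
            return x, it
        z = apply_Minv(r)
        rz_new = np.sum(r * z)
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter

# Unit square with a centred disc inclusion (hypothetical test configuration).
nx = nz = 32
xc = (np.arange(nx) + 0.5) / nx
X, Z = np.meshgrid(xc, xc, indexing="ij")
kappa = np.where((X - 0.5) ** 2 + (Z - 0.5) ** 2 < 0.25 ** 2, 10.0, 1.0)
b = np.zeros((nx, nz))
b[:, 0] = 2 * kappa[:, 0] * 1.0  # T = 1 at z = 0, T = 0 at z = 1
T, iters = pcg(lambda u: tpfa_apply(kappa, u), make_preconditioner(nx, nz), b)
# Total inlet flux; equals the effective conductivity for unit size and unit drop.
kappa_eff = np.sum(2 * kappa[:, 0] * (1.0 - T[:, 0]))
```

With a single scalar reference parameter, rescaling `kappa_ref` does not change the PCG iterates, so its value is immaterial in this sketch. Choosing the reference parameters would only matter once they differ by direction.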

Paper Structure

This paper contains 14 sections, 2 theorems, 54 equations, 10 figures, 3 tables, 3 algorithms.

Key Result

Lemma 3.1

If there exist constants $0< \Lambda'\leq \Lambda''$ such that for all $z_h$ and $q_h \in W_h$, then $\mathop{\mathrm{cond}}\nolimits(\mathtt{A}_\mathup{ref}^{-1}\mathtt{A})\leq \Lambda'' / \Lambda'$.
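
For context, condition-number bounds of this kind typically follow from a spectral-equivalence hypothesis. A minimal sketch in matrix form, assuming $\mathtt{A}$ and $\mathtt{A}_\mathrm{ref}$ are symmetric positive definite and writing $\mathtt{q}$ for a coefficient vector (this notation is an assumption made here and is not taken from the paper):

$$
\Lambda'\,\mathtt{q}^{\top}\mathtt{A}_\mathrm{ref}\,\mathtt{q}
\;\leq\; \mathtt{q}^{\top}\mathtt{A}\,\mathtt{q}
\;\leq\; \Lambda''\,\mathtt{q}^{\top}\mathtt{A}_\mathrm{ref}\,\mathtt{q}
\quad \text{for all } \mathtt{q}
\;\;\Longrightarrow\;\;
\sigma\big(\mathtt{A}_\mathrm{ref}^{-1}\mathtt{A}\big) \subset [\Lambda', \Lambda'']
\;\;\Longrightarrow\;\;
\mathrm{cond}\big(\mathtt{A}_\mathrm{ref}^{-1}\mathtt{A}\big) \leq \Lambda''/\Lambda'.
$$

The middle step uses the fact that the eigenvalues of $\mathtt{A}_\mathrm{ref}^{-1}\mathtt{A}$ are the stationary values of the generalized Rayleigh quotient $\mathtt{q}^{\top}\mathtt{A}\,\mathtt{q} / \mathtt{q}^{\top}\mathtt{A}_\mathrm{ref}\,\mathtt{q}$.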

Figures (10)

  • Figure 1: An illustration of an RVE and boundary parts $\Gamma_\mathup{in}$, $\Gamma_\mathup{out}$ and $\Gamma_\mathup{N}$.
  • Figure 2: An illustration of $\bm{v}_h$ and $W_h$.
  • Figure 3: Convergence of $\kappa^z_\mathup{eff}$ with respect to $\mathtt{dof}$ in the center-ball RVE configuration, where $\kappa^\mathup{inc} < 1$ in (a) and $\kappa^\mathup{inc} > 1$ in (b).
  • Figure 4: Convergence histories for different preconditioners in the center-ball RVE configuration, where in each plot, the x-axis represents the PCG iteration step and the y-axis represents the relative residual.
  • Figure 5: Three heterogeneous RVE configurations where balls (inclusions) with different sizes are randomly distributed, referred to as config-(a), config-(b), and config-(c) respectively.
  • ...and 5 more figures

Theorems & Definitions (4)

  • Lemma 3.1
  • Definition 3.2: 1D DCT
  • Proposition 3.3
  • proof
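
The abstract states that FCTs are derived from FFTs, and Definition 3.2 concerns the 1D DCT. As a hedged illustration, the classical even/odd reordering trick computes a batch of DCT-II transforms from one complex FFT per row. This sketch is not necessarily the paper's memory-efficient variant. The unnormalized convention $X_k = 2\sum_{n} x_n \cos\big(\pi k (2n+1)/(2N)\big)$ and the function names `dct2_via_fft` and `dct2_direct` are assumptions made for this example.

```python
import numpy as np

def dct2_via_fft(x):
    """Unnormalized DCT-II along the last axis, computed with one FFT per row."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    # Reorder: even-indexed samples first, then odd-indexed samples reversed.
    v = np.concatenate([x[..., ::2], x[..., 1::2][..., ::-1]], axis=-1)
    V = np.fft.fft(v, axis=-1)
    # A quarter-sample phase shift turns the DFT of v into the DCT-II of x.
    k = np.arange(n)
    twiddle = np.exp(-1j * np.pi * k / (2 * n))
    return 2.0 * np.real(twiddle * V)

def dct2_direct(x):
    """Reference O(N^2) evaluation of the same DCT-II definition."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    C = 2.0 * np.cos(np.pi * k * (2 * j + 1) / (2 * n))
    return x @ C.T

# Batched usage: each row is transformed independently ("multiple" transforms).
batch = np.arange(12, dtype=float).reshape(3, 4)
coeffs = dct2_via_fft(batch)
```

Because the input is real, a real-to-complex FFT could replace `np.fft.fft` to cut memory use. That refinement is omitted here to keep the sketch short.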