A fast cosine transformation accelerated method for predicting effective thermal conductivity
Changqing Ye, Shubin Fu, Eric T. Chung
TL;DR
This work addresses the computational bottleneck of predicting effective thermal conductivity (ETC) for heterogeneous materials, which requires solving a PDE on high-resolution RVEs. It discretizes the PDE with a Two-Point Flux-Approximation (TPFA) scheme under mixed Dirichlet–Neumann boundary conditions and solves the resulting linear system with the Preconditioned Conjugate Gradient (PCG) method. The preconditioner is derived from FFT-based homogenization methods, and its homogeneous reference parameters are determined by a small linear programming problem. The preconditioner system is solved efficiently using multiple Fast Cosine Transforms (FCT) and parallel tridiagonal solvers, with a memory-efficient FCT implementation derived from FFTs to suit CUDA platforms. Numerical experiments on 3D RVEs with up to 512^3 degrees of freedom demonstrate a GPU speedup of about 5× over CPU and robust convergence across various heterogeneities, though performance degrades for very high contrasts. The approach offers a practical, hardware-friendly pathway for rapid ETC prediction, applicable to materials design and thermal management.
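To make the "cosine transforms plus tridiagonal solvers" idea concrete, here is a minimal 2D sketch under assumptions that are not taken from the paper: a unit-spacing, constant-coefficient reference operator with Dirichlet faces along x and Neumann faces along y, and hypothetical helper names (`dct_basis`, `thomas_batched`, `solve_reference`). The cosine transform is applied as a dense matrix product for readability; a fast transform would replace it in practice.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II matrix Q (rows are cosine modes), so Q @ Q.T = I."""
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    q = np.cos(np.pi * (2 * j + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    q[0] /= np.sqrt(2.0)
    return q

def thomas_batched(lower, diag, upper, rhs):
    """Thomas algorithm along axis 0, vectorized over the columns of diag/rhs.

    lower and upper are scalar off-diagonal entries (assumed constant here).
    """
    n = diag.shape[0]
    c = np.empty_like(diag)
    d = np.empty_like(rhs)
    c[0] = upper / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower * c[i - 1]
        c[i] = upper / denom
        d[i] = (rhs[i] - lower * d[i - 1]) / denom
    x = np.empty_like(rhs)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def solve_reference(rhs, kx, ky):
    """Solve (kx * Tx (x) I + ky * I (x) Ty) u = rhs on an (nx, ny) grid.

    Tx: cell-centered 1D Laplacian with Dirichlet ends (diagonal 3, 2, ..., 2, 3).
    Ty: cell-centered 1D Laplacian with Neumann ends (diagonal 1, 2, ..., 2, 1),
        which the DCT-II basis diagonalizes.
    """
    nx, ny = rhs.shape
    q = dct_basis(ny)
    lam = 2.0 - 2.0 * np.cos(np.pi * np.arange(ny) / ny)  # eigenvalues of Ty
    rhs_hat = rhs @ q.T                                   # cosine transform along y
    tx_diag = np.full(nx, 2.0)
    tx_diag[[0, -1]] = 3.0
    diag = kx * tx_diag[:, None] + ky * lam[None, :]      # one tridiagonal system per mode
    u_hat = thomas_batched(-kx, diag, -kx, rhs_hat)
    return u_hat @ q                                      # inverse cosine transform
```

After the transform along y, each cosine mode decouples into an independent tridiagonal system along x, which is why the solves can run in parallel. In a PCG loop, a routine like `solve_reference` would play the role of the preconditioner application.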
Abstract
Predicting effective thermal conductivity by solving a Partial Differential Equation (PDE) defined on a high-resolution Representative Volume Element (RVE) is a computationally intensive task. In this paper, we tackle this task by proposing an efficient and implementation-friendly computational method that can fully leverage the computing power offered by hardware accelerators, namely, graphics processing units (GPUs). We first employ the Two-Point Flux-Approximation scheme to discretize the PDE and then use the preconditioned conjugate gradient method to solve the resulting linear algebraic system. The construction of the preconditioner originates from FFT-based homogenization methods, and an engineered linear programming technique is used to determine the homogeneous reference parameters. The fundamental observation presented in this paper is that the preconditioner system can be solved efficiently using multiple Fast Cosine Transformations (FCT) and parallel tridiagonal matrix solvers. Because multiple FCTs are not available by default on the CUDA platform, we detail how to derive FCTs from FFTs with nearly optimal memory usage. Numerical experiments, including a stability comparison with standard preconditioners, are conducted for 3D RVEs. Our performance results indicate that the proposed method achieves a $5$-fold acceleration on the GPU platform over the pure CPU platform and solves problems with $512^3$ degrees of freedom and reasonable contrast ratios in less than $30$ seconds.
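As an illustration of how a cosine transform can be obtained from an FFT without enlarging the input, the sketch below uses the well-known same-length reordering trick (often attributed to Makhoul). It is not necessarily the derivation used in the paper, and NumPy stands in for a CUDA FFT library; the function names `dct2_via_fft` and `dct2_direct` are hypothetical. The transform is the unnormalized DCT-II, $X_k = \sum_{n=0}^{N-1} x_n \cos\big(\pi (2n+1) k / (2N)\big)$, applied along the last axis so that a batch of signals is handled in one call.

```python
import numpy as np

def dct2_via_fft(x):
    """Unnormalized DCT-II along the last axis using one length-N FFT per signal."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    # Reorder: even-indexed samples first, then odd-indexed samples reversed.
    v = np.concatenate([x[..., ::2], x[..., 1::2][..., ::-1]], axis=-1)
    v_hat = np.fft.fft(v, axis=-1)
    # A quarter-sample phase shift turns the FFT output into cosine coefficients.
    k = np.arange(n)
    twiddle = np.exp(-1j * np.pi * k / (2 * n))
    return (twiddle * v_hat).real

def dct2_direct(x):
    """O(N^2) reference implementation of the same unnormalized DCT-II."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    idx = np.arange(n)
    basis = np.cos(np.pi * (2 * idx[None, :] + 1) * idx[:, None] / (2 * n))
    return x @ basis.T
```

Because the FFT has the same length as the input, no padded or mirrored copy of the data is needed, which is one way to keep the memory footprint small. Calling `dct2_via_fft` on an array of shape `(batch, N)` transforms every row independently.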
