FourPhonon_GPU: A GPU-accelerated framework for calculating phonon scattering rates and thermal conductivity

Ziqi Guo; Xiulin Ruan; Guang Lin

TL;DR

The paper tackles the high computational cost of fully resolving phonon scattering, which scales as $N^3$ for three-phonon (3ph) and $N^4$ for four-phonon (4ph) processes and hinders accurate thermal conductivity predictions. It introduces FourPhonon_GPU, a GPU-accelerated framework built on the FourPhonon package that uses OpenACC to realize a heterogeneous CPU–GPU workflow: the CPU enumerates scattering events and the GPU evaluates their rates in parallel, preserving accuracy while substantially reducing runtime. Key contributions include over $25$-fold acceleration of the scattering-rate step and over $10$-fold total runtime speedup, a detailed comparison of GPU-offload versus CPU–GPU hybrid strategies, and benchmarking across GPU architectures (A100 > A30 > A10) with silicon as the test case, including explicit memory considerations for dense q-meshes. The work enables rigorous first-principles phonon transport calculations at scale, offers a practical path toward accelerated materials discovery, and outlines future directions such as iterative solvers and mixed-precision approaches.

Abstract

Accurately predicting phonon scattering is crucial for understanding thermal transport properties. However, the computational cost of such calculations, especially for four-phonon scattering, often becomes prohibitive when a large number of phonon branches and scattering processes are involved. In this work, we present FourPhonon_GPU, a GPU-accelerated framework for three-phonon and four-phonon scattering rate calculations based on the FourPhonon package. By leveraging OpenACC and adopting a heterogeneous CPU-GPU computing strategy, we efficiently offload massive, parallelizable tasks to the GPU while using the CPU for process enumeration and control-heavy operations. Our approach achieves over 25x acceleration for the scattering rate computation step and over 10x total runtime speedup without sacrificing accuracy. Benchmarking on various GPU architectures confirms the method's scalability and highlights the importance of aligning parallelization strategies with hardware capabilities. This work provides an efficient and accurate computational tool for phonon transport modeling and opens pathways for accelerated materials discovery.
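
The division of labor described in the abstract (branch-heavy process enumeration on the CPU, uniform per-process arithmetic on the GPU) can be illustrated with a minimal Python sketch. This is not the FourPhonon_GPU implementation: the one-dimensional grid, the `toy_dispersion` and `toy_weight` functions, and all other names are hypothetical placeholders chosen only to show the two-phase structure, and the second phase runs on the CPU here as a stand-in for the loop that would be offloaded.

```python
import math
from itertools import product


def toy_dispersion(n_q):
    """Hypothetical single-branch dispersion on a periodic 1D grid (illustration only)."""
    return [abs(math.sin(math.pi * q / n_q)) for q in range(n_q)]


def toy_weight(w1, w2, w3):
    """Placeholder per-process contribution; NOT a physical scattering matrix element."""
    return 1.0 / (1.0 + w1 + w2 + w3)


def enumerate_processes(n_q, omega, tol):
    """Phase 1 (CPU-style): branchy search for triplets that pass the selection rules."""
    processes = []
    for q1, q2 in product(range(n_q), repeat=2):
        q3 = (q1 + q2) % n_q  # momentum conservation on the periodic grid
        if abs(omega[q1] + omega[q2] - omega[q3]) < tol:  # energy conservation window
            processes.append((q1, q2, q3))
    return processes


def evaluate_rates(processes, omega, n_q):
    """Phase 2 (GPU-style): every process is independent, so this is the offloadable loop."""
    # One uniform arithmetic task per enumerated process, with no branching.
    contributions = [toy_weight(omega[a], omega[b], omega[c]) for a, b, c in processes]
    # Reduction: accumulate each contribution onto the mode it belongs to.
    rates = [0.0] * n_q
    for (q1, _, _), value in zip(processes, contributions):
        rates[q1] += value
    return rates


n_q = 8
omega = toy_dispersion(n_q)
allowed = enumerate_processes(n_q, omega, tol=0.05)
rates = evaluate_rates(allowed, omega, n_q)
```

Separating the two phases means the flat list of allowed processes is the only data the parallel phase needs, which is also why its size matters for memory on dense q-meshes.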

Paper Structure

Table of Contents

Figures (7)