HPRMAT: A high-performance R-matrix solver with GPU acceleration for coupled-channel problems in nuclear physics

Jin Lei

HPRMAT: A high-performance R-matrix solver with GPU acceleration for coupled-channel problems in nuclear physics

Jin Lei

TL;DR

HPRMAT addresses the computational bottleneck in R-matrix coupled-channel scattering by replacing legacy inversion with direct LU solving across four backends, including GPU-accelerated and mixed-precision options. The library preserves compatibility with Descouvemont’s interface while achieving up to 9× CPU speedups and 18× over legacy codes for matrices of size up to $N = n_{\rm ch} \times n_{\rm lag} \approx 25600$, and maintains cross-section accuracy better than $10^{-5}$ relative. The mixed-precision approach leverages the FP32:FP64 throughput advantage on consumer GPUs to factorize in $FP32$ with iterative refinement to $FP64$ accuracy, broadening accessibility to desktop workstations. Validation against Descouvemont’s reference code across multiple test cases confirms both numerical reliability and physical fidelity, enabling large-scale CDCC and coupled-channel calculations without expensive data-center resources. $N$ indicates the total system dimension $N = n_{\rm ch} \times n_{\rm lag}$, and cross sections remain accurate within the required nuclear-physics tolerances.$

Abstract

I present HPRMAT, a high-performance solver library for the linear systems arising in R-matrix coupled-channel scattering calculations in nuclear physics. Designed as a drop-in replacement for the linear algebra routines in existing R-matrix codes, HPRMAT employs direct linear equation solving with optimized libraries instead of traditional matrix inversion, achieving significant performance improvements. The package provides four solver backends: (1) double-precision LU factorization, (2) mixed-precision arithmetic with iterative refinement, (3) a Woodbury formula approach exploiting the kinetic-coupling matrix structure, and (4) GPU acceleration. Benchmark calculations demonstrate that the GPU solver achieves up to 9$\times$ speedup over optimized CPU direct solvers, and 18$\times$ over legacy inversion-based codes, for large matrices ($N=25600$). The mixed-precision strategy is particularly effective on consumer GPUs (e.g., NVIDIA RTX 3090/4090), where single-precision throughput exceeds double-precision by a factor of 64:1; by performing factorization in single precision with iterative refinement, HPRMAT overcomes the poor FP64 performance of consumer hardware while maintaining double-precision accuracy. This makes large-scale CDCC and coupled-channel calculations accessible to researchers using standard desktop workstations, without requiring expensive data-center GPUs. CPU-only solvers provide 5--7$\times$ speedup through optimized libraries and algorithmic improvements. All solvers maintain physics accuracy with relative errors below $10^{-5}$ in cross-section calculations, validated against Descouvemont's reference code (Comput.\ Phys.\ Commun.\ 200, 199--219 (2016)). HPRMAT provides interfaces for Fortran, C, Python, and Julia.

HPRMAT: A high-performance R-matrix solver with GPU acceleration for coupled-channel problems in nuclear physics

TL;DR

, and maintains cross-section accuracy better than

relative. The mixed-precision approach leverages the FP32:FP64 throughput advantage on consumer GPUs to factorize in

with iterative refinement to

accuracy, broadening accessibility to desktop workstations. Validation against Descouvemont’s reference code across multiple test cases confirms both numerical reliability and physical fidelity, enabling large-scale CDCC and coupled-channel calculations without expensive data-center resources.

indicates the total system dimension

, and cross sections remain accurate within the required nuclear-physics tolerances.$

Abstract

speedup over optimized CPU direct solvers, and 18

over legacy inversion-based codes, for large matrices (

). The mixed-precision strategy is particularly effective on consumer GPUs (e.g., NVIDIA RTX 3090/4090), where single-precision throughput exceeds double-precision by a factor of 64:1; by performing factorization in single precision with iterative refinement, HPRMAT overcomes the poor FP64 performance of consumer hardware while maintaining double-precision accuracy. This makes large-scale CDCC and coupled-channel calculations accessible to researchers using standard desktop workstations, without requiring expensive data-center GPUs. CPU-only solvers provide 5--7

speedup through optimized libraries and algorithmic improvements. All solvers maintain physics accuracy with relative errors below

in cross-section calculations, validated against Descouvemont's reference code (Comput.\ Phys.\ Commun.\ 200, 199--219 (2016)). HPRMAT provides interfaces for Fortran, C, Python, and Julia.

HPRMAT: A high-performance R-matrix solver with GPU acceleration for coupled-channel problems in nuclear physics

TL;DR

Abstract

HPRMAT: A high-performance R-matrix solver with GPU acceleration for coupled-channel problems in nuclear physics

TL;DR

Abstract

Paper Structure

Table of Contents

Figures (3)