GPU-Accelerated Modified Bessel Function of the Second Kind for Gaussian Processes
Zipei Geng, Sameh Abdulah, Ying Sun, Hatem Ltaief, David E. Keyes, Marc G. Genton
TL;DR
This work addresses a bottleneck in GPU-accelerated Gaussian process pipelines: computing the modified Bessel function of the second kind, $K_{\nu}(x)$. It introduces a hybrid algorithm that combines Temme's small-$x$ series with a refined Takekawa integral approach, optimized for CUDA and integrated into ExaGeoStat for efficient Matérn covariance matrix generation. Across a broad parameter range, the method achieves high numerical accuracy, comparable to GSL and superior to prior GPU approaches, and delivers speedups of up to 12.62× with four A100 GPUs over CPU-based implementations. The results benefit large-scale geospatial modeling and other GPU-reliant simulations, with practical impact on inference and prediction workflows in climate science, environmental statistics, and physics.
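To make the integral-based half of the hybrid concrete, the sketch below evaluates the standard integral representation $K_{\nu}(x) = \int_0^{\infty} e^{-x\cosh t}\cosh(\nu t)\,dt$ with a plain trapezoid rule, working in log space to avoid overflow in $\cosh(\nu t)$. It is an illustration only, not the authors' refined algorithm: the function names, the cutoff heuristic, and the default of 2000 intervals are assumptions made for this example, and it is intended for moderate $x$ and $\nu$ only. It also omits the Temme series used for small $x$.

```python
import math


def log_integrand(t, nu, x):
    """Log of exp(-x*cosh(t)) * cosh(nu*t), avoiding overflow in cosh(nu*t)."""
    a = abs(nu) * t
    # log(cosh(a)) = a + log(1 + exp(-2a)) - log(2), stable for large a
    log_cosh = a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)
    return -x * math.cosh(t) + log_cosh


def bessel_k_trapezoid(nu, x, n=2000):
    """Hypothetical helper: K_nu(x) for x > 0 via the trapezoid rule.

    Illustrative sketch only; the cutoff rule and n are assumptions.
    """
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Approximate location of the integrand's maximum: sinh(t) = |nu| / x.
    t_peak = math.asinh(abs(nu) / x)
    peak = log_integrand(t_peak, nu, x)
    # Extend the upper limit until the integrand is ~exp(-45) below the peak.
    t_max = t_peak + 1.0
    while log_integrand(t_max, nu, x) > peak - 45.0:
        t_max += 0.5
    h = t_max / n
    # Trapezoid sum of the integrand rescaled by exp(-peak).
    total = 0.5 * (
        math.exp(log_integrand(0.0, nu, x) - peak)
        + math.exp(log_integrand(t_max, nu, x) - peak)
    )
    for i in range(1, n):
        total += math.exp(log_integrand(i * h, nu, x) - peak)
    return h * total * math.exp(peak)
```

A quick sanity check is the closed form $K_{1/2}(x) = \sqrt{\pi/(2x)}\,e^{-x}$, which `bessel_k_trapezoid(0.5, x)` should reproduce closely for moderate $x$. Each evaluation is independent of the others, which is one reason integral-based schemes of this kind map naturally onto one GPU thread per matrix entry.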
Abstract
Modified Bessel functions of the second kind are widely used in physics, engineering, spatial statistics, and machine learning. Since contemporary scientific applications, including machine learning, rely on GPUs for acceleration, robust GPU-hosted implementations of special functions such as the modified Bessel function are crucial for performance. Existing implementations of the modified Bessel function of the second kind rely on CPUs and cover only part of the range of values needed in some applications. In this work, we present a robust implementation of the modified Bessel function of the second kind on GPUs, eliminating the dependence on the CPU host. We cover the range of values commonly used in real applications and provide high accuracy compared with common libraries such as the GNU Scientific Library (GSL), using Mathematica as the reference. Our GPU-accelerated approach also demonstrates a 2.68× performance improvement on a single A100 GPU compared to the GSL on 40-core Intel Cascade Lake CPUs. Our implementation is integrated into ExaGeoStat, an HPC framework for Gaussian process modeling, where the modified Bessel function of the second kind is required by the Matérn covariance function when generating covariance matrices. We accelerate the matrix generation process in ExaGeoStat by up to 12.62× with four A100 GPUs while maintaining almost the same accuracy for modeling and prediction operations on synthetic and real datasets.
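For context on where $K_{\nu}$ enters, one common parameterization of the Matérn covariance between two locations at distance $r > 0$ is given below. The symbols $\sigma^2$ (variance), $\beta$ (range), and $\nu$ (smoothness) are assumptions made for this illustration; the abstract does not state the exact form used in ExaGeoStat.

$$
C(r;\sigma^2,\beta,\nu) \;=\; \sigma^2\,\frac{2^{1-\nu}}{\Gamma(\nu)}\left(\frac{r}{\beta}\right)^{\nu} K_{\nu}\!\left(\frac{r}{\beta}\right), \qquad C(0) = \sigma^2 .
$$

For $\nu = 1/2$, the identity $K_{1/2}(z) = \sqrt{\pi/(2z)}\,e^{-z}$ reduces this to the exponential covariance $C(r) = \sigma^2 e^{-r/\beta}$. Under this form, every off-diagonal entry of an $n \times n$ covariance matrix requires one evaluation of $K_{\nu}$, so the cost of the Bessel function scales with $n^2$.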
