GPU-Accelerated Modified Bessel Function of the Second Kind for Gaussian Processes

Zipei Geng, Sameh Abdulah, Ying Sun, Hatem Ltaief, David E. Keyes, Marc G. Genton

TL;DR

This work tackles the bottleneck of computing the modified Bessel function of the second kind, $K_{\nu}(x)$, in GPU-accelerated Gaussian-process pipelines. It introduces a hybrid algorithm that combines Temme's small-$x$ series with a refined version of Takekawa's integral approach, optimized for CUDA and integrated into ExaGeoStat for Matérn covariance matrix generation. Across a broad parameter range, the method reaches numerical accuracy comparable to GSL and better than prior GPU approaches, and it speeds up covariance matrix generation by up to 12.62× with four A100 GPUs relative to CPU-based implementations. The results are relevant to large-scale geospatial modeling and other GPU-reliant simulations, with practical impact on inference and prediction workflows in climate science, environmental statistics, and physics.

Abstract

Modified Bessel functions of the second kind are widely used in physics, engineering, spatial statistics, and machine learning. Since contemporary scientific applications, including machine learning, rely on GPUs for acceleration, providing robust GPU-hosted implementations of special functions, such as the modified Bessel function, is crucial for performance. Existing implementations of the modified Bessel function of the second kind rely on CPUs and have limited coverage of the full range of values needed in some applications. In this work, we present a robust implementation of the modified Bessel function of the second kind on GPUs, eliminating the dependence on the CPU host. We cover a range of values commonly used in real applications, with accuracy comparable to that of common libraries such as the GNU Scientific Library (GSL), using Mathematica as the reference. Our GPU-accelerated approach also demonstrates a 2.68× performance improvement using a single A100 GPU compared to the GSL on 40-core Intel Cascade Lake CPUs. Our implementation is integrated into ExaGeoStat, the HPC framework for Gaussian process modeling, where the modified Bessel function of the second kind is required by the Matérn covariance function in generating covariance matrices. We accelerate the matrix generation process in ExaGeoStat by up to 12.62× with four A100 GPUs while maintaining almost the same accuracy for modeling and prediction operations using synthetic and real datasets.
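To make the role of $K_{\nu}(x)$ in the Matérn covariance concrete, here is a minimal CPU sketch in Python. It is not the paper's algorithm: it evaluates the standard integral representation $K_{\nu}(x) = \int_0^{\infty} e^{-x \cosh t} \cosh(\nu t)\, dt$ with a plain trapezoidal rule, which stands in for both the Temme series and the refined Takekawa quadrature. The function names `log_bessel_k` and `matern`, the step count, the cutoff rule, and the Matérn parameterization $C(r) = \frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)} (r/\beta)^{\nu} K_{\nu}(r/\beta)$ are all assumptions made for illustration.

```python
import math


def _log_cosh(a):
    """Overflow-safe log(cosh(a))."""
    a = abs(a)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def log_bessel_k(nu, x, n=2000):
    """log K_nu(x) via trapezoidal quadrature of the integral representation.

    Illustrative sketch only (hypothetical helper, not the paper's method).
    The integrand is exp(-phi(t)) with phi(t) = x*cosh(t) - log(cosh(nu*t)).
    """
    if x <= 0.0:
        raise ValueError("x must be positive")

    def phi(t):
        return x * math.cosh(t) - _log_cosh(nu * t)

    # Walk outward until the integrand has dropped by a factor of e^40
    # below the largest value seen so far; 'shift' tracks min(phi).
    shift = phi(0.0)
    t_max = 0.0
    while True:
        t_max += 0.5
        p = phi(t_max)
        shift = min(shift, p)
        if p - shift > 40.0:
            break

    # Trapezoidal rule on [0, t_max], with the integrand rescaled by
    # exp(shift) so that very large or very small values stay representable.
    h = t_max / n
    total = 0.5 * (math.exp(shift - phi(0.0)) + math.exp(shift - phi(t_max)))
    for i in range(1, n):
        total += math.exp(shift - phi(i * h))
    return math.log(h * total) - shift


def matern(r, sigma2, beta, nu):
    """Matern covariance at distance r (assumed parameterization, see text)."""
    if r == 0.0:
        return sigma2
    z = r / beta
    log_c = (math.log(sigma2) - (nu - 1.0) * math.log(2.0) - math.lgamma(nu)
             + nu * math.log(z) + log_bessel_k(nu, z))
    return math.exp(log_c)


if __name__ == "__main__":
    # For nu = 0.5 the Matern covariance reduces to sigma2 * exp(-r / beta).
    print(matern(0.3, 2.0, 0.1, 0.5), 2.0 * math.exp(-3.0))
```

Working in log space keeps the result finite where $K_{\nu}(x)$ itself would overflow or underflow. A covariance matrix over $n$ locations needs one such evaluation per pair of points, which is why the cost of this function dominates matrix generation.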

Paper Structure

This paper contains 18 sections, 18 equations, 12 figures, 1 table, 3 algorithms.

Figures (12)

  • Figure 1: Relative error of Takekawa's algorithm vs Mathematica for $(\nu, x) \in [0.001, 5] \times [0.001, 0.1]$.
  • Figure 2: LogBesselK accuracy comparisons using heatmap for $(\nu, x) \in [0.001, 20] \times [0.001, 140]$.
  • Figure 3: LogBesselK accuracy comparisons using heatmap for $(\nu, x) \in [0.001, 5] \times [0.001, 0.1]$.
  • Figure 4: Boxplots of MLE optimization results over 100 replicas comparing GSL (CPU) and refined algorithm (GPU). Left column: $\nu=0.5$ cases. Right column: $\nu=1$ cases. All plots show parameter estimates ($\sigma^2$, $\beta$, $\nu$) and iteration counts with red dashed lines indicating true values of parameters.
  • Figure 5: Boxplots of MLE optimization results over 100 replicas using GSL on CPU and the refined algorithm on GPU when $\nu=1$ and using $b=40$ bins.
  • ...and 7 more figures