Accurate Computation of the Logarithm of Modified Bessel Functions on GPUs

Andreas Plesner; Hans Henrik Brandenborg Sørensen; Søren Hauberg

Accurate Computation of the Logarithm of Modified Bessel Functions on GPUs

Andreas Plesner, Hans Henrik Brandenborg Sørensen, Søren Hauberg

TL;DR

This work tackles the instability and insufficient precision of existing routines for computing the logarithms of modified Bessel functions $\log I_v(x)$ and $\log K_v(x)$, especially at large orders and on GPUs. It introduces two novel, log-domain algorithms (including $\mu_K$ and $U_K$ asymptotics) and an integral-based approach, combining them via input-region aware selection to achieve machine-precision accuracy with robust underflow/overflow handling. The authors demonstrate dramatic speedups on GPUs and CPUs compared with SciPy and other libraries, and validate the method on a high-dimensional vMF fitting task using CIFAR-10 features, illustrating practical impact in areas like uncertainty quantification in metric learning. The work provides a fast, robust, open-source CUDA/C++ implementation suitable for GPU-accelerated scientific computing with direct relevance to statistical models relying on $I_v$ and $K_v$.

Abstract

Bessel functions are critical in scientific computing for applications such as machine learning, protein structure modeling, and robotics. However, currently, available routines lack precision or fail for certain input ranges, such as when the order $v$ is large, and GPU-specific implementations are limited. We address the precision limitations of current numerical implementations while dramatically improving the runtime. We propose two novel algorithms for computing the logarithm of modified Bessel functions of the first and second kinds by computing intermediate values on a logarithmic scale. Our algorithms are robust and never have issues with underflows or overflows while having relative errors on the order of machine precision, even for inputs where existing libraries fail. In C++/CUDA, our algorithms have median and maximum speedups of 45x and 6150x for GPU and 17x and 3403x for CPU, respectively, over the ranges of inputs and third-party libraries tested. Compared to SciPy, the algorithms have median and maximum speedups of 77x and 300x for GPU and 35x and 98x for CPU, respectively, over the tested inputs. The ability to robustly compute a solution and the low relative errors allow us to fit von Mises-Fisher, vMF, distributions to high-dimensional neural network features. This is, e.g., relevant for uncertainty quantification in metric learning. We obtain image feature data by processing CIFAR10 training images with the convolutional layers of a pre-trained ResNet50. We successfully fit vMF distributions to 2048-, 8192-, and 32768-dimensional image feature data using our algorithms. Our approach provides fast and accurate results while existing implementations in SciPy and mpmath fail to fit successfully. Our approach is readily implementable on GPUs, and we provide a fast open-source implementation alongside this paper.

Accurate Computation of the Logarithm of Modified Bessel Functions on GPUs

TL;DR

This work tackles the instability and insufficient precision of existing routines for computing the logarithms of modified Bessel functions

and

, especially at large orders and on GPUs. It introduces two novel, log-domain algorithms (including

and

asymptotics) and an integral-based approach, combining them via input-region aware selection to achieve machine-precision accuracy with robust underflow/overflow handling. The authors demonstrate dramatic speedups on GPUs and CPUs compared with SciPy and other libraries, and validate the method on a high-dimensional vMF fitting task using CIFAR-10 features, illustrating practical impact in areas like uncertainty quantification in metric learning. The work provides a fast, robust, open-source CUDA/C++ implementation suitable for GPU-accelerated scientific computing with direct relevance to statistical models relying on

and

Abstract

is large, and GPU-specific implementations are limited. We address the precision limitations of current numerical implementations while dramatically improving the runtime. We propose two novel algorithms for computing the logarithm of modified Bessel functions of the first and second kinds by computing intermediate values on a logarithmic scale. Our algorithms are robust and never have issues with underflows or overflows while having relative errors on the order of machine precision, even for inputs where existing libraries fail. In C++/CUDA, our algorithms have median and maximum speedups of 45x and 6150x for GPU and 17x and 3403x for CPU, respectively, over the ranges of inputs and third-party libraries tested. Compared to SciPy, the algorithms have median and maximum speedups of 77x and 300x for GPU and 35x and 98x for CPU, respectively, over the tested inputs. The ability to robustly compute a solution and the low relative errors allow us to fit von Mises-Fisher, vMF, distributions to high-dimensional neural network features. This is, e.g., relevant for uncertainty quantification in metric learning. We obtain image feature data by processing CIFAR10 training images with the convolutional layers of a pre-trained ResNet50. We successfully fit vMF distributions to 2048-, 8192-, and 32768-dimensional image feature data using our algorithms. Our approach provides fast and accurate results while existing implementations in SciPy and mpmath fail to fit successfully. Our approach is readily implementable on GPUs, and we provide a fast open-source implementation alongside this paper.

Paper Structure (21 sections, 1 theorem, 32 equations, 4 figures, 8 tables, 1 algorithm)

This paper contains 21 sections, 1 theorem, 32 equations, 4 figures, 8 tables, 1 algorithm.

Introduction
Related Work
Applications of Bessel Functions.
Calculating Bessel Functions.
Calculating Logarithm of Bessel Functions.
Theory
Modified Bessel function of the first kind
Modified Bessel function of the second kind
Algorithms
Choosing the right expressions
Pseudo code
GPU specific optimizations
Data and experimental setup for numerical experiments
Test Regions
Test Points Sampling
...and 6 more sections

Key Result

Corollary 1

The series in eq: Iv infinite series converges absolutely for all inputs.

Figures (4)

Figure 1: Comparison of the robustness and runtime of Scipy's scaled implementation and our library for calculating the logarithm of the modified Bessel function of the first kind.
Figure 2: Cumulative distribution of relative errors for our and the three third-party libraries when computing $\log I_v(x)$ for the Small region. The plot illustrates the comparative performance in terms of numerical precision. Our library has a steeper curve, indicating a higher proportion of outputs with minimal relative errors, demonstrating our library's higher precision in evaluating the logarithm of the modified Bessel function of the first kind over the entire range tested.
Figure 3: Metric learning pipeline. Images from CIFAR10 ( \ref{['fig:cifar10 examples']}) are resized to three different sizes of $D=32$, $D=64$, and $D=128$, and passed through the convolutional layers from ResNet50 to extract image features. We fit a vMF distribution to the $l_2$-normalized features. The extracted features have 2048, 8192, and 32768 dimensions, respectively.
Figure 4: Example images from the CIFAR10 dataset.

Theorems & Definitions (1)

Corollary 1

Accurate Computation of the Logarithm of Modified Bessel Functions on GPUs

TL;DR

Abstract

Accurate Computation of the Logarithm of Modified Bessel Functions on GPUs

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (4)

Theorems & Definitions (1)