lrux: Fast low-rank updates of determinants and Pfaffians in JAX
Ao Chen, Christopher Roth
TL;DR
The paper tackles the bottleneck of repeatedly evaluating antisymmetric wavefunctions in quantum Monte Carlo (QMC) by introducing lrux, a JAX-based library that performs fast low-rank updates of determinants and Pfaffians. By applying the matrix determinant lemma and its Pfaffian analogues, lrux reduces the cost of each update from $\mathcal{O}(n^3)$ to $\mathcal{O}(n^2 k)$ for update rank $k$, and adds delayed-update strategies that further reduce memory traffic on modern accelerators. It provides GPU-friendly interfaces with support for real and complex data and one-hot representations, along with extensive benchmarks showing speedups of up to $\sim 1000\times$ for Pfaffians at large $n$ and practical guidance on choosing delay bounds. The work positions lrux as a scalable, drop-in component for QMC workflows and antisymmetric wavefunction evaluations, enabling larger-scale simulations and more efficient variational approaches. Overall, lrux advances high-performance quantum chemistry and condensed-matter computations by delivering robust, hardware-aware low-rank update capabilities for determinants and Pfaffians in the JAX ecosystem.
Abstract
We present lrux, a JAX-based software package for fast low-rank updates of determinants and Pfaffians, targeting the dominant computational bottleneck in various quantum Monte Carlo (QMC) algorithms. The package implements efficient low-rank updates that reduce the cost of successive wavefunction evaluations from $\mathcal{O}(n^3)$ to $\mathcal{O}(n^2k)$ when the update rank $k$ is smaller than the matrix dimension $n$. Both determinant and Pfaffian updates are supported, together with delayed-update strategies that trade floating-point operations for reduced memory traffic on modern accelerators. lrux natively integrates with JAX transformations such as JIT compilation, vectorization, and automatic differentiation, and supports both real and complex data types. Benchmarks on GPUs demonstrate up to $1000\times$ speedup at large matrix sizes. lrux enables scalable, high-performance evaluation of antisymmetric wavefunctions and is designed as a drop-in component for a wide range of QMC workflows. lrux is available at https://github.com/ChenAo-Phys/lrux.
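To make the $\mathcal{O}(n^3) \to \mathcal{O}(n^2k)$ reduction concrete, here is a minimal sketch of a rank-$k$ determinant update written in plain `jax.numpy`. It is not the lrux API: the function name `det_lowrank_update` and its arguments are hypothetical and chosen only for illustration. It relies on the matrix determinant lemma, $\det(A + UV^{\top}) = \det(A)\,\det(I_k + V^{\top}A^{-1}U)$, and on the Woodbury identity to refresh the stored inverse, assuming $A^{-1}$ and $\det(A)$ are already known from a previous step.

```python
import jax
import jax.numpy as jnp

# Double precision makes the comparison against a direct evaluation tight.
jax.config.update("jax_enable_x64", True)


def det_lowrank_update(A_inv, det_A, U, V):
    """Hypothetical helper (not part of lrux).

    Given A^{-1} and det(A), return det(A + U V^T) and (A + U V^T)^{-1}
    using only O(n^2 k) work, where U and V have shape (n, k).
    """
    k = U.shape[1]
    AinvU = A_inv @ U                                   # (n, k), costs O(n^2 k)
    VtAinv = V.T @ A_inv                                # (k, n), costs O(n^2 k)
    R = jnp.eye(k, dtype=A_inv.dtype) + V.T @ AinvU     # small (k, k) matrix
    new_det = det_A * jnp.linalg.det(R)                 # matrix determinant lemma
    new_inv = A_inv - AinvU @ jnp.linalg.solve(R, VtAinv)  # Woodbury identity
    return new_det, new_inv


# Small illustrative problem: n = 6, update rank k = 2.
key_a, key_u, key_v = jax.random.split(jax.random.PRNGKey(0), 3)
n, k = 6, 2
A = jax.random.normal(key_a, (n, n)) + n * jnp.eye(n)  # shifted to keep A well conditioned
U = jax.random.normal(key_u, (n, k))
V = jax.random.normal(key_v, (n, k))

new_det, new_inv = det_lowrank_update(jnp.linalg.inv(A), jnp.linalg.det(A), U, V)
A_new = A + U @ V.T  # reference matrix, used only to check the result
```

In a Monte Carlo loop the returned inverse would be carried forward to the next step, so the $\mathcal{O}(n^3)$ factorization is paid only once. The sketch covers determinants only; the Pfaffian case uses an analogous identity that is not reproduced here.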
