GPU-accelerated Effective Hamiltonian Calculator

Abhishek Chakraborty, Taylor L. Patti, Brucek Khailany, Andrew N. Jordan, Anima Anandkumar

TL;DR

The paper introduces qCHeff, an open-source Python package that enables GPU-accelerated, numerically stable effective-Hamiltonian calculations for large quantum systems. It combines two complementary approaches: NPAD-based iterative Schrieffer-Wolff transformations for efficient block-diagonalization of time-independent problems, and a Magnus expansion-based time-evolution framework for accurate time-coarse-grained simulations of rapidly driven dynamics. The authors report speedups of up to 15x for NPAD on GPU over CPU and up to 42x for the Magnus expansion compared with QuTiP, while maintaining high accuracy. The methods are validated on models such as the Jaynes-Cummings-Hubbard lattice, strongly driven qubits, and degenerate spin chains. They enable scalable analysis of high-dimensional quantum systems with interpretable effective dynamics and offer a path toward further GPU-accelerated, sparse, and higher-order extensions in quantum simulation and control.

Abstract

Effective Hamiltonian calculations for large quantum systems can be both analytically intractable and numerically expensive using standard techniques. In this manuscript, we present numerical techniques inspired by Nonperturbative Analytical Diagonalization (NPAD) and the Magnus expansion for the efficient calculation of effective Hamiltonians. While these tools are appropriate for a wide array of applications, we here demonstrate their utility for models that can be realized in circuit-QED settings. Our numerical techniques are available as an open-source Python package, ${\rm qCH_{eff}}$, which is available on GitHub (https://github.com/NVlabs/qCHeff) and PyPI (https://pypi.org/project/qcheff/). We use the CuPy library for GPU-acceleration and report up to 15x speedup on GPU over CPU for NPAD, and up to 42x speedup for the Magnus expansion (compared to QuTiP), for large system sizes.

Paper Structure

This paper contains 19 sections, 20 equations, 8 figures.

Figures (8)

  • Figure 1: (a) Overview of the qCHeff package API. (b) The NPAD algorithm implements an iterative Schrieffer-Wolff transformation. Givens rotations [Givens1958] exactly diagonalize two-level subspaces of the full Hamiltonian, and iteration can be stopped early once the desired subspace Hamiltonian or set of eigenvalues is obtained. (c) The Magnus expansion can accurately and efficiently simulate time-evolution for quantum systems with rapid time dependence.
  • Figure 2: (Top) Schematic showing the setup for the Jaynes-Cummings-Hubbard (JCH) model, sometimes also called the Jaynes-Cummings (JC) lattice model. Each lattice site (blue circle) has a linear resonator (frequency $\omega$) coupled to an atom (frequency $\epsilon$) with strength $g$. Photons can hop between neighboring resonators at a rate $\kappa$. (Bottom) Uncoupled and coupled JC spectrum.
  • Figure 3: (Top) Chemical potential boundaries between different Mott lobes for the JCH model in the atomic limit ($\kappa/g\ll 1$), computed using NPAD. Solid lines indicate numerical results and crosses ($\times$) mark values predicted by the analytical expression in Eq. \ref{eq:mott-lobes-theory}. (Bottom) Relative error for NPAD and numerical diagonalization compared to the exact analytical expression (Eq. \ref{eq:mott-lobes-theory}). The solid lines indicate the error averaged over all energy levels/Mott lobes for a given detuning, and the shaded region indicates the $95\%$ confidence interval around the mean. In this problem, NPAD has the added benefit of greater numerical precision over exact diagonalization due to fewer numerical operations. NPAD performs only one Givens rotation to calculate each eigenvalue/Mott lobe, which is exact in this case due to the structure and symmetry of the problem. Numerical diagonalization, on the other hand, solves for the matrix eigenvalues using general diagonalization techniques, which requires more floating-point operations and therefore accumulates a larger floating-point error. Both methods have error below $10^{-12}$, which does not qualitatively affect predictions in typical scenarios.
  • Figure 4: Comparing performance of NPAD and numerical diagonalization on both CPU and GPU. The test sparse Hamiltonian matrix is $H_{\rm test} = a^\dagger a + (a + a^\dagger)$, where $a (a^\dagger)$ is a bosonic mode annihilation (creation) operator truncated to $N$ levels. For NPAD, we benchmark the running time of applying 5 NPAD iterations each with a single Givens rotation to test Hamiltonians of various sizes. We impose a 1 timeout for each operation on a matrix of a given size and report only parts which finish within this timeout: numerical diagonalization up to matrix dimension around $10^4$ on CPU and $10^5$ on GPU, and NPAD on CPU up to a matrix dimension around $10^6$. On both CPU and GPU, NPAD is roughly an order of magnitude faster compared to numerical diagonalization on the same type of device. The running time for NPAD on GPU for smaller matrices is largely determined by the baseline execution latency on GPUs and remains roughly constant up to a matrix dimension of $10^6$, beyond which the matrix size in memory is comparable to the total available VRAM. A similar plateau is seen on the CPU, but for small matrices. As the matrix size grows in CPU memory, the running time increases rapidly. NPAD is roughly an order of magnitude faster on GPUs, which are optimized for sparse matrix multiplication, compared to CPUs.
  • Figure 5: Comparing the Magnus expansion to the RWA for a strongly driven qubit with drive strength $\Omega_0/\omega = 0.33$, where $\omega$ is the qubit frequency. On resonance, the RWA is no longer a good approximation; however, the Magnus expansion predicts the correct stroboscopic time-evolution. The full simulation is done with 5000 time steps, while the Magnus time-evolution uses just 50 Magnus intervals.
  • ...and 3 more figures