GPU Accelerated Minimal Auxiliary Basis Approach TDDFT for Large Organic Molecules

Zehao Zhou, Xiaojie Wu, Yanheng Li, Xinran Wei, Cheng Fan, Fusong Ju, Qiming Sun, Yi Qin Gao

Abstract

We introduce a GPU-accelerated implementation of time-dependent density functional theory with the minimal auxiliary basis approach (TDDFT-risp) in GPU4PySCF, together with large-system demonstrations carried out using the Tamm--Dancoff approximation (TDA-risp). The method combines GPU-accelerated three-center integral evaluation, tensor contractions, exchange-space truncation, omission of hydrogen atoms from the auxiliary basis, and a host-memory-assisted Davidson solver. On the EXTEST42 benchmark set, a conservative 40 eV exchange cutoff yields excitation-energy errors relative to standard TDA of about 0.03--0.05 eV for low-lying states. For systems of 300 to 3000 atoms, we demonstrate that TDA-risp calculations of 15 low-lying excited states with $\omega$B97XD/def2-SVP complete on a single A100 GPU with wall times ranging from minutes to hours. These results position GPU-TDDFT-risp as a practical route toward excited-state calculations for large organic and biomolecular systems with thousands of atoms.
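As a rough illustration of what an exchange-space truncation with an energy cutoff could look like, the sketch below selects the molecular orbitals retained for the exchange term. It assumes the window is measured downward from the HOMO for occupied orbitals and upward from the LUMO for virtual orbitals, and that orbital energies are sorted in ascending order; the exact convention used by the authors is not stated in this summary. The function name `exchange_window` and the toy orbital energies are hypothetical and are not part of GPU4PySCF.

```python
import numpy as np

HARTREE_TO_EV = 27.211386245988  # CODATA conversion factor


def exchange_window(mo_energy, n_occ, cutoff_ev=40.0):
    """Return indices of occupied and virtual MOs kept for exchange.

    Assumption (not confirmed by the source): an occupied orbital is kept
    if it lies within `cutoff_ev` below the HOMO, and a virtual orbital is
    kept if it lies within `cutoff_ev` above the LUMO.
    `mo_energy` is in Hartree and sorted in ascending order.
    """
    mo_energy = np.asarray(mo_energy, dtype=float)
    cutoff = cutoff_ev / HARTREE_TO_EV
    e_homo = mo_energy[n_occ - 1]
    e_lumo = mo_energy[n_occ]

    occ_idx = np.arange(n_occ)
    vir_idx = np.arange(n_occ, mo_energy.size)

    keep_occ = occ_idx[mo_energy[:n_occ] >= e_homo - cutoff]
    keep_vir = vir_idx[mo_energy[n_occ:] <= e_lumo + cutoff]
    return keep_occ, keep_vir


# Toy orbital energies in Hartree (made up for illustration only).
mo_energy = np.array([-10.0, -1.2, -0.4, -0.3, 0.05, 0.3, 1.0, 2.5])
keep_occ, keep_vir = exchange_window(mo_energy, n_occ=4, cutoff_ev=40.0)
print(keep_occ, keep_vir)
```

In this toy case the deep core-like orbital and the highest virtual orbital fall outside the window, so the exchange tensors would be built over three occupied and three virtual orbitals instead of four each.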

Paper Structure

This paper contains 16 sections, 21 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: MO-basis three-center ERI tensors. Orange $\mathbf{T}_{ia}^P$ is used for Coulomb, green $\mathbf{T}_{ij}^{P'}$ and $\mathbf{T}_{ab}^{P'}$ for exchange, and $\mathbf{T}_{ia}^{P'}$ exclusively for full TDDFT exchange. $N_\text{auxK}$ and $N_\text{auxJ}$ are the numbers of auxiliary basis functions for the exchange and Coulomb terms, respectively. $N_\text{occK'}$ and $N_\text{virtK'}$ are the numbers of occupied and virtual MOs after truncation, and are therefore smaller than the full counts $N_\text{occ}$ and $N_\text{vir}$.
  • Figure 2: The Davidson diagonalization flow chart with GPU acceleration. The yellow areas indicate major CPU memory-resident quantities. The red areas highlight intensive GPU computations and CPU--GPU data transfers.
  • Figure 3: Average 20-state excitation-energy RMSE (TDA-risp vs. standard TDA) on the EXTEST42 set as a function of the exchange-window truncation threshold.
  • Figure 4: RMSE of the 20 lowest excitation energies (TDA-risp vs. standard TDA) versus molecular size for the EXTEST42 set.
  • Figure 5: Average $S_1$ energy MAE and average RMSE of the 20 lowest excitations for TDA-risp on the EXTEST42 40--99 atom subset.
  • ...and 7 more figures
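
The error metrics named in the captions of Figures 3 to 5 can be sketched as follows. This is a minimal example, assuming the RMSE is taken over the excitation energies of one molecule (approximate versus reference) and that set-level numbers are then averaged over molecules; the energies below are made-up placeholders, not results from the paper.

```python
import numpy as np


def rmse(approx, reference):
    """Root-mean-square error between two sets of excitation energies (eV)."""
    diff = np.asarray(approx, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def mae(approx, reference):
    """Mean absolute error between two sets of excitation energies (eV)."""
    diff = np.asarray(approx, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(diff)))


# Placeholder excitation energies in eV for a single molecule.
reference = np.array([3.00, 3.50, 4.00, 4.20])  # e.g. standard TDA
approx = np.array([3.02, 3.46, 4.03, 4.20])     # e.g. TDA-risp

rmse_value = rmse(approx, reference)
mae_value = mae(approx, reference)
print(f"RMSE = {rmse_value:.4f} eV, MAE = {mae_value:.4f} eV")
```

Because the RMSE weights large deviations more heavily than the MAE, reporting both gives a sense of whether errors are uniform across states or dominated by a few outliers.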