Transfer learning of GW-Bethe-Salpeter Equation excitation energies
Dario Baum, Arno Förster, Lucas Visscher
TL;DR
This work addresses data scarcity in many-body perturbation theory (MBPT) by showing that graph neural networks pretrained on abundant low-fidelity data (DFT molecular-orbital energies and TDDFT excitation energies) can be finetuned with limited high-fidelity labels (qs$GW$ and qs$GW$-BSE) to predict quasiparticle and excitation energies with MBPT-like accuracy. Using ViSNet, the authors compare full-model and readout-only finetuning across diverse test sets and find that multi-fidelity pretraining improves generalization, reduces the amount of high-fidelity data required, and mitigates large outliers. They also show that cross-property transfer helps when the pretraining and target properties are related, and they extend the approach to qs$GW$-BSE, where TDDFT pretraining often outperforms DFT pretraining. Overall, the study offers a data-efficient route to MBPT-quality excited-state predictions across chemical space, enabling rapid ML-driven screening while underscoring the importance of dataset diversity and of aligning the pretraining property with the target property.
Abstract
A persistent challenge in machine learning for electronic-structure calculations is the sharp imbalance between abundant low-fidelity data, such as DFT or TDDFT results, and scarce high-fidelity data, such as many-body perturbation theory labels. We show that transfer learning provides an effective route to bridge this gap: graph neural networks pretrained on DFT and TDDFT properties can be finetuned with limited qs$GW$ and qs$GW$-BSE data to yield accurate predictions of quasiparticle and excitation energies. Assessing both full-model and readout-only finetuning across chemically diverse test sets, we find that pretraining improves accuracy, reduces reliance on costly qs$GW$ data, and mitigates large predictive outliers, even for molecules that are larger than or chemically distinct from those seen during finetuning. Our results demonstrate that multi-fidelity transfer learning can substantially extend the reach of many-body-level predictions across chemical space.
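The difference between full-model and readout-only finetuning can be illustrated with a deliberately small sketch. It is not the authors' ViSNet setup: the "network" is a hypothetical one-unit tanh backbone with a linear readout, the low- and high-fidelity labels are synthetic functions chosen for illustration, and all names (`predict`, `train`, `BACKBONE`, `READOUT`) are assumptions. The only point it makes is mechanical: after pretraining on plentiful low-fidelity labels, readout-only finetuning updates the final layer and leaves the backbone frozen, while full-model finetuning updates every parameter.

```python
import math

BACKBONE = ("w1", "b1")   # hypothetical feature-extractor parameters
READOUT = ("w2", "b2")    # hypothetical readout (output head) parameters

def predict(params, x):
    """Toy model: one tanh backbone unit followed by a linear readout."""
    h = math.tanh(params["w1"] * x + params["b1"])
    return params["w2"] * h + params["b2"]

def mse(params, data):
    """Mean squared error of the toy model on (x, y) pairs."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

def train(params, data, trainable, lr=0.05, steps=500):
    """Gradient descent on the MSE, updating only the names in `trainable`."""
    params = dict(params)  # copy so the caller's parameters stay untouched
    n = len(data)
    for _ in range(steps):
        grads = {k: 0.0 for k in params}
        for x, y in data:
            h = math.tanh(params["w1"] * x + params["b1"])
            err = params["w2"] * h + params["b2"] - y
            dh = params["w2"] * (1.0 - h * h)  # d(pred)/d(pre-activation)
            grads["w2"] += 2 * err * h / n
            grads["b2"] += 2 * err / n
            grads["w1"] += 2 * err * dh * x / n
            grads["b1"] += 2 * err * dh / n
        for k in trainable:  # frozen parameters are simply never updated
            params[k] -= lr * grads[k]
    return params

# Synthetic stand-ins: many cheap labels, few expensive ones (assumed forms).
low_fidelity = [(-2 + 0.1 * i, math.tanh(1.5 * (-2 + 0.1 * i))) for i in range(41)]
high_fidelity = [(x, 1.2 * math.tanh(1.5 * x) + 0.3) for x in (-1.5, -0.5, 0.5, 1.5)]

init = {"w1": 0.5, "b1": 0.0, "w2": 0.5, "b2": 0.0}
pretrained = train(init, low_fidelity, trainable=BACKBONE + READOUT)

# Two finetuning strategies starting from the same pretrained parameters.
readout_only = train(pretrained, high_fidelity, trainable=READOUT)
full = train(pretrained, high_fidelity, trainable=BACKBONE + READOUT)

print("pretrained   MSE on high-fidelity:", mse(pretrained, high_fidelity))
print("readout-only MSE on high-fidelity:", mse(readout_only, high_fidelity))
print("full-model   MSE on high-fidelity:", mse(full, high_fidelity))
```

In a real graph-neural-network workflow the same choice is usually expressed by excluding the backbone parameters from the optimizer or disabling their gradients; which strategy works better depends on how closely the pretraining property tracks the target property, as the abstract notes.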
