Transfer learning of GW-Bethe-Salpeter Equation excitation energies

Dario Baum, Arno Förster, Lucas Visscher

TL;DR

This work tackles the data-scarcity barrier in many-body perturbation theory (MBPT) by showing that graph neural networks pretrained on abundant low-fidelity data (DFT MO energies and TDDFT excitations) can be finetuned with limited high-fidelity labels (qs$GW$ and qs$GW$-BSE) to predict quasiparticle and excitation energies with MBPT-like accuracy. Using ViSNet, the authors compare full-model and readout-only finetuning across diverse test sets, finding that multi-fidelity pretraining improves generalization, reduces the amount of high-fidelity data required, and mitigates large outliers. They also demonstrate cross-property transfer benefits when the pretraining and target properties are related, and extend the approach to qs$GW$-BSE, where TDDFT pretraining often outperforms DFT pretraining. Overall, the study offers a data-efficient pathway to MBPT-quality excited-state predictions across chemical space, enabling rapid ML-driven screening while underscoring the importance of dataset diversity and of aligning the pretraining target with the finetuning target.

Abstract

A persistent challenge in machine learning for electronic-structure calculations is the sharp imbalance between abundant low-fidelity data, such as DFT or TDDFT results, and scarce high-fidelity data, such as many-body perturbation theory labels. We show that transfer learning provides an effective route to bridge this gap: graph neural networks pretrained on DFT and TDDFT properties can be finetuned with limited qs$GW$ and qs$GW$-BSE data to yield accurate predictions of quasiparticle and excitation energies. Assessing both full-model and readout-only finetuning across chemically diverse test sets, we find that pretraining improves accuracy, reduces reliance on costly qs$GW$ data, and mitigates large predictive outliers, even for molecules that are larger than, or chemically distinct from, those seen during finetuning. Our results demonstrate that multi-fidelity transfer learning can substantially extend the reach of many-body-level predictions across chemical space.
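The difference between full-model and readout-only finetuning can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the model is a three-parameter scalar toy (not ViSNet), the names `ToyModel`, `train`, and `mae` are hypothetical, and the "low-fidelity" and "high-fidelity" labels are synthetic numbers rather than DFT or qs$GW$ data. It only shows the mechanics: pretrain all parameters on plentiful low-fidelity labels, then finetune on a few high-fidelity labels while either updating everything or freezing the backbone.

```python
from dataclasses import dataclass, replace


@dataclass
class ToyModel:
    """Hypothetical stand-in for a GNN: one backbone weight, one readout layer."""
    w_back: float = 0.5  # "backbone" (representation) weight
    w_read: float = 0.5  # "readout" weight
    b_read: float = 0.0  # "readout" bias

    def predict(self, x: float) -> float:
        hidden = self.w_back * x            # learned representation
        return self.w_read * hidden + self.b_read


def train(model, data, lr=0.05, epochs=500, readout_only=False):
    """Gradient descent on the mean squared error.

    With readout_only=True the backbone weight is frozen and only the
    readout parameters are updated.
    """
    n = len(data)
    for _ in range(epochs):
        g_back = g_read = g_bias = 0.0
        for x, y in data:
            hidden = model.w_back * x
            err = model.w_read * hidden + model.b_read - y
            g_back += 2.0 * err * model.w_read * x / n
            g_read += 2.0 * err * hidden / n
            g_bias += 2.0 * err / n
        if not readout_only:
            model.w_back -= lr * g_back
        model.w_read -= lr * g_read
        model.b_read -= lr * g_bias
    return model


def mae(model, data):
    """Mean absolute error of the model on (x, y) pairs."""
    return sum(abs(model.predict(x) - y) for x, y in data) / len(data)


# Synthetic labels (assumed): many cheap "low-fidelity" points and a few
# "high-fidelity" points that are related to them but shifted and rescaled.
low_fidelity = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
high_fidelity = [(x, 2.2 * x + 0.3) for x in (-1.0, 0.0, 1.0)]

if __name__ == "__main__":
    pretrained = train(ToyModel(), low_fidelity)
    full = train(replace(pretrained), high_fidelity)
    readout = train(replace(pretrained), high_fidelity, readout_only=True)
    print("full-model MAE:  ", mae(full, high_fidelity))
    print("readout-only MAE:", mae(readout, high_fidelity))
```

In a real graph neural network the same choice is usually made by disabling gradients for the backbone parameters; the toy above makes the frozen-versus-updated distinction explicit instead.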

Paper Structure

This paper contains 10 sections, 6 figures.

Figures (6)

  • Figure 1: MAE of qs$GW$ QP HOMO predictions from small (a) and large (b) models pretrained on different numbers of DFT samples.
  • Figure 2: Per-sample absolute errors of qs$GW$ QP HOMO energy predictions from small (a) and large (b) models with and without pretraining.
  • Figure 3: MAE of QP HOMO energy, QP LUMO energy and QP gap predictions with pretraining on different QP energy targets.
  • Figure 4: MAEs of QP HOMO energy predictions after finetuning on different numbers of samples, with and without prior pretraining, normalized to the respective MAE obtained when finetuning on the full finetuning set (120 000 samples).
  • Figure 5: MAE of qs$GW$-BSE excitation energy predictions after pretraining on DFT and TDDFT data.
  • ...and 1 more figure