Minimising the Demand for High-Fidelity Training Data towards Chemically Accurate Adsorption Energy Predictions

Zhihao Zhang, Xiao-Ming Cao

TL;DR

This paper presents DOTA, a functional-independent deep learning model built on the mapping between local density of states (LDOS) and adsorption energy. It provides a robust framework for efficient catalyst and electrode screening and bridges the gap between computational and experimental data.

Abstract

Adsorption energy is a critical descriptor for high-throughput screening of heterogeneous catalysts and electrode materials. However, precise experimental data are scarce due to the complexity of experiments, while high-fidelity density functional theory (DFT) calculations remain computationally expensive for large-scale material screening. Machine learning models trained on DFT data have emerged as a promising alternative but face challenges such as functional dependency and limited high-fidelity labelled data. Herein, we present DOS Transformer for Adsorption (DOTA), a functional-independent deep learning model established on the mapping between local density of states (LDOS) and adsorption energy. DOTA integrates multi-head self-attention mechanisms with LDOS feature engineering to capture latent orbital interaction patterns, enabling it to unify multi-fidelity and multi-source data. This minimises the demand for high-fidelity training data. Consequently, the predicted adsorption energies can reach chemical accuracy with fewer than five high-fidelity experimental adsorption energies used for model training. DOTA also resolves long-standing challenges, such as the "CO puzzle", and outperforms traditional theories, including the d-band centre and Fermi softness models. It provides a robust framework for efficient catalyst and electrode screening, bridging the gap between computational and experimental data.
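
To make the phrase "multi-head self-attention over LDOS features" concrete, the following is a minimal NumPy sketch of generic scaled dot-product multi-head self-attention applied to an LDOS curve that has been cut into energy windows. It is not the DOTA architecture: the window size, the random projection matrices, the placeholder LDOS values, and the function name `multi_head_self_attention` are all assumptions made for illustration, and the usual output projection and any trained weights are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(tokens, w_q, w_k, w_v, n_heads):
    """Generic multi-head self-attention (no output projection).

    tokens: (n_tokens, d_model) array, here one row per energy window.
    w_q, w_k, w_v: (d_model, d_model) projection matrices (hypothetical).
    Returns the attended tokens and the per-head attention weights.
    """
    n_tokens, d_model = tokens.shape
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    d_head = d_model // n_heads

    def split(x):
        # (n_tokens, d_model) -> (n_heads, n_tokens, d_head)
        return x.reshape(n_tokens, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(tokens @ w_q), split(tokens @ w_k), split(tokens @ w_v)
    # Pairwise similarity between energy windows, scaled by sqrt(d_head).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)      # (n_heads, n_tokens, n_tokens)
    attended = weights @ v                  # (n_heads, n_tokens, d_head)
    merged = attended.transpose(1, 0, 2).reshape(n_tokens, d_model)
    return merged, weights

rng = np.random.default_rng(0)
ldos = rng.random(64)            # placeholder LDOS on 64 energy grid points
tokens = ldos.reshape(8, 8)      # assumed partition: 8 windows of 8 points
d_model = tokens.shape[1]
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, weights = multi_head_self_attention(tokens, w_q, w_k, w_v, n_heads=2)
```

Each row of `weights[h]` is a probability distribution describing how strongly one energy window attends to every other window in head `h`, which is the kind of cross-window coupling an attention layer can learn from data.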

Paper Structure

This paper contains 16 sections, 4 figures.

Figures (4)

  • Figure 1: a) Procedure from coordination to energy. b) Pretraining and fine-tuning workflow of DOTA exemplified by CO adsorption. c) Illustration of DOTA, with PDOS partition in the lavender area, orbital interaction in the yellow area, and output $E_{\rm{ad}}$ in the blue area.
  • Figure 2: a) Parity plot and kernel density plot of adsorption energies between DFT(PBE) calculated and DOTA-PBE predicted from five-fold cross-validation. b) Plot of d-band centre versus DFT(PBE) hydroxyl $E_{\rm{ad}}$. c) Plot of DFT(PBE) hydroxyl $E_{\rm{ad}}$ versus DOTA-PBE predicted energy.
  • Figure 3: a) LDOSs of surface Ag in Ag(111) and surface Au in Au(111). b) Contribution of each energy level of surface Ag in Ag(111) and surface Au in Au(111) to OH adsorption atop Ag(111) and Au(111). c) LDOSs of surface Pt and Y atoms in $\rm Pt_3Y$(111). d) Contribution of each energy level of surface Pt and Y to H adsorption atop $\rm Pt_3Y$(111).
  • Figure 4: a) Cohesive energies and predicted adsorption energies of various combinations of functionals on Pt(111). b) Cohesive energies and predicted adsorption energies of various combinations of functionals on Rh(111). Only the adsorption energies for the preferred site are marked, and the difference between the top and fcc sites is indicated by a colour gradient from blue to red. Each DOTA prediction using a combination of functionals is labelled in the format "functional used for the adsorbate / functional used for the metal slab". The standard quantum chemistry calculation results are labelled with the functionals or methods. The experimental result is marked with the label ‘exp.’. c) Experimental versus predicted CO adsorption energies and dissociative adsorption energies of $\rm H_2$ and $\rm O_2$. Results are shown for various metal (111) surfaces: Rh [Ads_Rh], Au [Ads_Au], Pt [Ads_Pt], Pd [Ads_Pd], Cu [Ads_Cu], Ir [Ads_Ir], Ag [Ads_Ag], Ni [Ads_Ni], and Co [Ads_Co]. Circles and squares correspond to adsorption at the top site and the fcc site, respectively. All data points except the Cu(111)-fcc value are experimental values taken from ref. [3_1]. The $E_{\rm{ad}}$ of the fcc site on Cu(111) is from ref. [Ads_CuFCC], which used highly accurate quantum-embedded multi-configuration pair-density functional theory (emb-MC-PDFT).
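
For context on the d-band centre used as a baseline in Figure 2b, the descriptor is conventionally defined as the first moment of the d-projected density of states, $\varepsilon_d = \int E\,\rho_d(E)\,dE \,/ \int \rho_d(E)\,dE$, with energies referenced to the Fermi level. The sketch below computes it with trapezoidal integration. The function name `band_centre`, the optional energy window, and the synthetic Gaussian DOS are assumptions for illustration and do not come from the paper.

```python
import numpy as np

def band_centre(energies, dos, e_min=None, e_max=None):
    """First moment of a projected DOS (energies in eV, Fermi level at 0).

    energies, dos: 1-D arrays on the same, increasing energy grid.
    e_min, e_max: optional integration window (hypothetical convenience).
    """
    energies = np.asarray(energies, dtype=float)
    dos = np.asarray(dos, dtype=float)
    mask = np.ones(energies.shape, dtype=bool)
    if e_min is not None:
        mask &= energies >= e_min
    if e_max is not None:
        mask &= energies <= e_max
    e, rho = energies[mask], dos[mask]

    def trapezoid(y, x):
        # Explicit trapezoidal rule, independent of the NumPy version.
        return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

    norm = trapezoid(rho, e)
    if norm == 0.0:
        raise ValueError("DOS integrates to zero in the chosen window")
    return trapezoid(e * rho, e) / norm

# Synthetic d-band: a Gaussian centred 2 eV below the Fermi level.
grid = np.linspace(-10.0, 6.0, 1601)
d_dos = np.exp(-0.5 * (grid + 2.0) ** 2)
centre = band_centre(grid, d_dos)
```

Because this descriptor collapses the whole DOS curve into a single number, two surfaces with different DOS shapes can share the same centre, which is one reason a model that reads the full LDOS may capture trends that the d-band centre misses.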