Data-driven construction of machine-learning-based interatomic potentials for gas-surface scattering dynamics: the case of NO on graphite

Samuel Del Fré, Gilberto A. Alou Angulo, Maurice Monnerville, Alejandro Rivero Santamaría

Abstract

Accurate atomistic simulations of gas-surface scattering require potential energy surfaces that remain reliable over broad configurational and energetic ranges while retaining the efficiency needed for extensive trajectory sampling. Here, we develop a data-driven workflow for constructing a machine-learning interatomic potential (MLIP) tailored to gas-surface scattering dynamics, using nitric oxide (NO) scattering from highly oriented pyrolytic graphite (HOPG) as a benchmark system. Starting from an initial ab initio molecular dynamics (AIMD) dataset, local atomic environments are described by SOAP descriptors and analyzed in a reduced feature space obtained through principal component analysis. Farthest point sampling is then used to build a compact training set, and the resulting Deep Potential model is refined through a query-by-committee active-learning strategy using additional configurations extracted from molecular dynamics simulations over extended ranges of incident energies and surface temperatures. The final MLIP reproduces reference energies and forces with high fidelity and enables large-scale molecular dynamics simulations of NO scattering from graphite at a computational cost far below that of AIMD. The simulations provide detailed insight into adsorption energetics, trapping versus direct scattering probabilities, translational energy loss, angular distributions, and rotational excitation. Overall, the results reproduce the main experimental trends and demonstrate that descriptor-guided sampling combined with active learning offers an efficient and transferable strategy for constructing MLIPs for gas-surface interactions.
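
The descriptor-guided selection step summarized in the abstract (PCA reduction followed by farthest point sampling) can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: random vectors stand in for SOAP descriptors, and the function names, array sizes, number of components, and number of selected points are all hypothetical.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto their leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def farthest_point_sampling(X, n_select, start=0):
    """Greedy FPS: repeatedly pick the point farthest from the already selected set."""
    selected = [start]
    # Distance from every point to its nearest selected point.
    min_dist = np.linalg.norm(X - X[start], axis=1)
    for _ in range(n_select - 1):
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected

# Assumption: random vectors stand in for per-environment SOAP descriptors.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(500, 32))
reduced = pca_reduce(descriptors, n_components=2)
picked = farthest_point_sampling(reduced, n_select=20)
```

Because each new point maximizes its distance to the current selection, the chosen subset spreads across the reduced feature space rather than concentrating in densely sampled regions, which is the motivation for using FPS to build a compact training set.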

Paper Structure (18 sections, 1 equation, 9 figures, 1 table)

This paper contains 18 sections, 1 equation, 9 figures, 1 table.

Figures (9)

  • Figure 1: PCA projection of the SOAP descriptor space for the full AIMD dataset, with FPS-selected environments highlighted as blue triangles (NO) and dark green dots (graphite). The inset shows the coverage evolution.
  • Figure 2: Energy (upper panel) and force (lower panel) distributions for the full AIMD dataset, the FPS-sampled dataset A, the QBC-selected configurations, and the final refined dataset B. Force magnitudes are shown on a logarithmic scale.
  • Figure 3: Three-dimensional representation of the $N_{\mathrm{st}}^{\mathrm{QBC}}$ configurations selected during the active learning phase as a function of surface temperature $T_{\mathrm{surf}}$ and incident energy $E_{\mathrm{inc}}$ (upper panel). Distribution of the QBC model-deviation score ($\Delta F$) over the extracted configurations on a logarithmic scale, with the selected uncertainty window highlighted (lower panel).
  • Figure 4: Parity plots between MLIP predictions of energies (upper panel) and forces (lower panel) and the corresponding DFT reference data for validation set B.
  • Figure 5: Scattering probability $P_{\mathrm{scat}}$ obtained from molecular dynamics simulations. Top: dependence on the incident energy $E_{\mathrm{inc}}$ for $T_{\mathrm{surf}} = 100$ K. Bottom: dependence on the surface temperature $T_{\mathrm{surf}}$ for $E_{\mathrm{inc}} = 0.1$ eV.
  • ...and 4 more figures
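
The query-by-committee selection referred to in the Figure 3 caption can be sketched as follows. The listing does not define the model-deviation score $\Delta F$, so this sketch assumes a commonly used form: the maximum over atoms of the committee standard deviation of the predicted force vector. The window bounds and function names are hypothetical placeholders, not values taken from the paper.

```python
import numpy as np

def max_force_deviation(forces):
    """Committee disagreement for one configuration.

    forces: array of shape (n_models, n_atoms, 3) with each model's predicted forces.
    Returns the largest per-atom standard deviation of the force vector.
    """
    mean = forces.mean(axis=0)
    # Squared deviation of each model from the committee mean, summed over x, y, z,
    # then averaged over the committee members.
    var = ((forces - mean) ** 2).sum(axis=2).mean(axis=0)
    return float(np.sqrt(var).max())

def in_uncertainty_window(delta_f, lower, upper):
    """Keep configurations that are uncertain but not wildly off (hypothetical bounds)."""
    return lower <= delta_f < upper

# Assumption: random forces stand in for the predictions of a 4-model committee.
rng = np.random.default_rng(1)
committee_forces = rng.normal(size=(4, 10, 3))
delta_f = max_force_deviation(committee_forces)
keep = in_uncertainty_window(delta_f, lower=0.05, upper=0.5)
```

Configurations below the lower bound are already well described by the committee, while those above the upper bound are often unphysical, so only the intermediate window would typically be sent for reference calculations.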