Machine-learning interatomic potentials achieving CCSD(T) accuracy for systems with extended covalent networks and van der Waals interactions
Yuji Ikeda, Axel Forslund, Pranav Kumar, Yongliang Ou, Jong Hyun Jung, Andreas Köhn, Blazej Grabowski
TL;DR
This work addresses the challenge of attaining CCSD(T)-level accuracy for materials with extended covalent networks and vdW interactions by developing a Δ-learning interatomic potential built on a dispersion-corrected tight-binding baseline. The authors train a Moment Tensor Potential (MTP) to predict CCSD(T)-F12 energy corrections on molecular fragments and combine it with GFN2-xTB baseline energies, enabling CCSD(T)-quality potential energy surfaces (PESs) for periodic systems such as covalent organic frameworks (COFs). The TB+$\Delta$MTP model achieves RMSEs below 0.4 meV/atom on training and test sets and reproduces electronic total atomization energies, bond lengths, vibrational frequencies, and intermolecular interaction energies with CCSD(T)-level accuracy, as demonstrated on H2, C6H6, and the C48H30 COF, including inter-layer binding and H2 adsorption. This methodology provides a practical route to large-scale, CCSD(T)-accurate simulations and high-throughput screening of vdW-dominated materials such as COFs.
Abstract
Machine-learning interatomic potentials (MLIPs) enable large-scale atomistic simulations at moderate computational cost while retaining ab initio accuracy. MLIPs trained on coupled-cluster data, particularly CCSD(T), have emerged as a promising route to achieve chemical accuracy beyond the limits of density functional theory (DFT) and to incorporate non-empirical van der Waals (vdW) interactions. Most existing approaches are, however, not straightforwardly applicable to systems with extended covalent networks, such as covalent organic frameworks (COFs), because CCSD(T) is rarely available for periodic systems. Here we present a methodology to train MLIPs with CCSD(T) accuracy for these systems. The approach uses the Δ-learning method with a dispersion-corrected tight-binding baseline. This strategy enables training on compact molecular fragments while preserving transferability to periodic systems. Dispersion interactions are accounted for by adding vdW-bound multimers to the training set, and the combination with a vdW-aware tight-binding baseline allows the formally local MLIP to attain CCSD(T)-level accuracy even for systems dominated by long-range vdW forces. The resulting potential yields root-mean-square energy errors below 0.4 meV/atom on training and test sets and reproduces electronic total atomization energies, bond lengths, harmonic vibrational frequencies, and intermolecular interaction energies for benchmark molecular systems. We apply the method to a prototypical quasi-two-dimensional COF composed of carbon and hydrogen. The COF structure, inter-layer binding energies, and hydrogen adsorption are analyzed at CCSD(T) accuracy. The developed methodology opens a practical route to large-scale atomistic simulations with chemical accuracy for systems with extended covalent networks and vdW interactions.
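The Δ-learning idea described above, a baseline energy plus a learned correction toward the reference method, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, a linear least-squares model stands in for the Moment Tensor Potential, and the helper names (`fit_delta_model`, `predict_total`, `rmse_per_atom_mev`) are hypothetical. No GFN2-xTB or CCSD(T) calculation is performed.

```python
import numpy as np

def fit_delta_model(features, e_ref, e_base):
    """Fit a linear model to the correction e_ref - e_base.

    The linear fit is only a stand-in for the MTP (assumption for illustration).
    """
    delta = e_ref - e_base
    weights, *_ = np.linalg.lstsq(features, delta, rcond=None)
    return weights

def predict_total(features, e_base, weights):
    """Total energy = baseline energy + learned correction."""
    return e_base + features @ weights

def rmse_per_atom_mev(e_pred, e_ref, n_atoms):
    """Root-mean-square error of per-atom energies, converted from eV to meV."""
    err = (e_pred - e_ref) / n_atoms
    return 1000.0 * np.sqrt(np.mean(err ** 2))

# Synthetic toy data (not from the paper): five "fragments" with two descriptors each.
rng = np.random.default_rng(0)
n_atoms = np.array([2, 12, 14, 24, 26])
features = rng.normal(size=(5, 2))
true_w = np.array([0.05, -0.02])      # hypothetical correction weights, eV
e_base = -5.0 * n_atoms               # hypothetical baseline energies, eV
e_ref = e_base + features @ true_w    # hypothetical reference energies, eV

weights = fit_delta_model(features, e_ref, e_base)
e_pred = predict_total(features, e_base, weights)
rmse = rmse_per_atom_mev(e_pred, e_ref, n_atoms)
print(f"RMSE: {rmse:.3e} meV/atom")
```

Because the model only has to learn the difference between the baseline and the reference, the correction can be fitted on small fragments while the baseline supplies the remaining physics, which here is said to include the long-range dispersion contribution.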
