Scalable Training of Neural Network Potentials for Complex Interfaces Through Data Augmentation

In Won Yeu; Annika Stuke; Jon López-Zorrilla; James M. Stevenson; David R. Reichman; Richard A. Friesner; Alexander Urban; Nongnuch Artrith

TL;DR

The paper addresses the high data and computational demands of training neural network potentials for complex material interfaces. It introduces a GPR-based data-augmentation strategy (GPR-ANN) that indirectly incorporates force information by generating synthetic energies from local GPR surrogates, enabling efficient energy-only training with uncertainty estimates for active learning. Across H2, EC dimers, and EC on Li metal surfaces, GPR-ANN achieves accuracy and robustness comparable to direct force training but with substantially reduced memory and compute requirements, thanks to local GPR models and scalable augmentation factors. This approach yields a practical, scalable pathway to high-fidelity interfacial potentials, facilitating large-scale simulations relevant to battery interfaces and other heterogeneous condensed-phase systems.

Abstract

Artificial neural network (ANN) potentials enable highly accurate atomistic simulations of complex materials at unprecedented scales. Despite their promise, training ANN potentials to represent intricate potential energy surfaces (PES) with transferability to diverse chemical environments remains computationally intensive, especially when atomic force data are incorporated to improve PES gradients. Here, we present an efficient ANN potential training methodology that uses Gaussian process regression (GPR) to incorporate atomic forces into ANN training, leading to accurate PES models with fewer additional first-principles calculations and reduced computational effort for training. Our GPR-ANN approach generates synthetic energy data from force information in the reference dataset, thus augmenting the training datasets and bypassing direct force training. Benchmark tests on hybrid density-functional theory data for ethylene carbonate (EC) molecules and Li metal-EC interfaces, relevant for lithium metal battery applications, demonstrate that GPR-ANN potentials achieve accuracies comparable to fully force-trained ANNs with significantly reduced computational overhead. Detailed comparisons show that the method improves both data efficiency and scalability for complex interfaces and heterogeneous environments. This work establishes the GPR-ANN method as a powerful and scalable framework for constructing high-fidelity machine learning interatomic potentials, offering the computational and memory efficiency critical for large-scale simulations of materials interfaces.
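
The augmentation idea described above (turning force information into additional synthetic energy labels through a GPR surrogate) can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation. It assumes a toy Morse curve standing in for a diatomic PES, a squared-exponential kernel with hand-picked hyperparameters, and a single local GPR model conditioned on both energies and energy gradients (negative forces). All function names, parameter values, and the choice of midpoints as augmentation geometries are hypothetical.

```python
import numpy as np

# Toy Morse parameters (eV, 1/Angstrom, Angstrom); assumed for illustration only.
D_E, A, R0 = 4.7, 1.9, 0.74

def morse_energy(r):
    x = np.exp(-A * (r - R0))
    return D_E * (1.0 - x) ** 2 - D_E

def morse_force(r):
    # Force is the negative gradient of the energy: F = -dE/dr.
    x = np.exp(-A * (r - R0))
    return -2.0 * D_E * A * (1.0 - x) * x

def kernel_blocks(xa, xb, length, sigma):
    """Squared-exponential kernel and its derivatives.

    Returns cov(E, E), cov(E(xa), E'(xb)), cov(E'(xa), E(xb)), cov(E', E').
    """
    d = xa[:, None] - xb[None, :]
    k = sigma**2 * np.exp(-0.5 * d**2 / length**2)
    k_eg = k * d / length**2
    k_ge = -k * d / length**2
    k_gg = k * (1.0 / length**2 - d**2 / length**4)
    return k, k_eg, k_ge, k_gg

def gpr_synthetic_energies(r_ref, e_ref, f_ref, r_new,
                           length=0.3, sigma=1.0, noise=1e-8):
    """Predict energies (and standard deviations) at r_new from reference
    energies and forces, using a GPR with derivative observations."""
    k_ee, k_eg, k_ge, k_gg = kernel_blocks(r_ref, r_ref, length, sigma)
    K = np.block([[k_ee, k_eg], [k_ge, k_gg]])
    K += noise * np.eye(K.shape[0])           # jitter for numerical stability
    mean = e_ref.mean()                        # constant prior mean
    y = np.concatenate([e_ref - mean, -f_ref]) # gradients are negative forces
    s_ee, s_eg, _, _ = kernel_blocks(r_new, r_ref, length, sigma)
    Ks = np.hstack([s_ee, s_eg])               # cov(new energies, observations)
    e_new = mean + Ks @ np.linalg.solve(K, y)
    var = sigma**2 - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return e_new, np.sqrt(np.clip(var, 0.0, None))

# Reference data: a few geometries with energies and forces (stand-in for DFT).
r_ref = np.linspace(0.6, 1.2, 5)
e_ref = morse_energy(r_ref)
f_ref = morse_force(r_ref)

# Augmentation geometries: midpoints between reference geometries (assumed choice).
r_new = 0.5 * (r_ref[:-1] + r_ref[1:])
e_syn, e_std = gpr_synthetic_energies(r_ref, e_ref, f_ref, r_new)

# Augmented energy-only training set for a downstream ANN fit.
r_train = np.concatenate([r_ref, r_new])
e_train = np.concatenate([e_ref, e_syn])
```

In this sketch the pairs `(r_train, e_train)` would be passed to an ordinary energy-only training loop, while `e_std` could serve as a filter for rejecting uncertain synthetic points or as a signal for selecting new reference calculations. How the paper constructs its local GPR models and chooses augmentation geometries is not specified in this section.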
