AiiDA-TrainsPot: Towards automated training of neural-network interatomic potentials

Davide Bidoggia, Nataliia Manko, Maria Peressi, Antimo Marrazzo

TL;DR

AiiDA-TrainsPot presents a fully automated, code-agnostic workflow for training neural-network interatomic potentials by integrating DFT labeling, dataset augmentation, and MD-based exploration within the AiiDA provenance framework. The method relies on a calibrated committee-disagreement criterion to selectively label configurations, enabling data-efficient active learning that scales to diverse materials such as carbon allotropes and W$_x$Mo$_{1-x}$Te$_2$ monolayers, and that extends to fine-tuning foundation models. Validation demonstrates strong accuracy and transferability, with two carbon-validation campaigns showing RMSEs in the meV–Å range and the ability to capture vibrational properties and defect energetics, while alloy benchmarks highlight robust phase-stability predictions. The work emphasizes reproducibility, modularity, and extensibility, offering a practical path toward democratizing access to high-accuracy NNIPs and enabling integration with future data-driven materials-design pipelines.

Abstract

Crafting neural-network interatomic potentials (NNIPs) remains a complex task, demanding specialized expertise in both machine learning and electronic-structure calculations. Here, we introduce AiiDA-TrainsPot, an automated, open-source, and user-friendly workflow that streamlines the creation of accurate NNIPs by orchestrating density-functional-theory calculations, data augmentation strategies, and classical molecular dynamics. Our active-learning strategy leverages on-the-fly calibration of committee disagreement against ab initio reference errors to ensure reliable uncertainty estimates. We use electronic-structure descriptors and dimensionality reduction to analyze the efficiency of this calibrated criterion, and show that it minimizes both false positives and false negatives when deciding what to compute from first principles. AiiDA-TrainsPot has a modular design that supports multiple NNIP backends, enabling both the training of NNIPs from scratch and the fine-tuning of foundation models. We demonstrate its capabilities through automated training campaigns targeting pristine and defective carbon allotropes, including amorphous carbon, as well as structural phase transitions in monolayer $\mathrm{W_xMo_{1-x}Te_2}$ alloys.
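As a rough illustration of the calibrated selection criterion described in the abstract, the sketch below computes a per-structure committee disagreement, fits a line mapping disagreement to reference error (the Figure 5 caption mentions a linear regression), and selects structures whose estimated error exceeds a threshold. The function names, the use of the committee standard deviation as the disagreement measure, and the threshold value are illustrative assumptions, not the package's actual API or definitions.

```python
import numpy as np


def committee_disagreement(predictions):
    """Per-structure spread of committee predictions.

    predictions: array of shape (n_models, n_structures).
    Assumption: disagreement is taken as the standard deviation
    across models; the paper may define it differently.
    """
    predictions = np.asarray(predictions, dtype=float)
    return predictions.std(axis=0)


def calibrate(disagreement, true_error):
    """Least-squares line: true_error ~ slope * disagreement + intercept."""
    slope, intercept = np.polyfit(disagreement, true_error, deg=1)
    return slope, intercept


def select_for_labelling(disagreement, slope, intercept, error_threshold):
    """Indices of structures whose calibrated error estimate exceeds the threshold."""
    estimated_error = slope * np.asarray(disagreement, dtype=float) + intercept
    return np.flatnonzero(estimated_error > error_threshold)


if __name__ == "__main__":
    # Toy data: 3 committee members, 4 structures (hypothetical numbers).
    preds = np.array([[1.00, 2.0, 3.0, 4.0],
                      [1.02, 2.1, 3.4, 4.9],
                      [0.98, 1.9, 2.7, 3.2]])
    d = committee_disagreement(preds)
    # Reference errors would come from ab initio labels of already-computed structures.
    reference_error = np.array([0.03, 0.15, 0.50, 1.20])
    slope, intercept = calibrate(d, reference_error)
    print(select_for_labelling(d, slope, intercept, error_threshold=0.4))
```

Once the line is fitted on structures that already have ab initio labels, the same mapping can be applied to unlabeled structures, where only the committee disagreement is available.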

Paper Structure

This paper contains 13 sections, 6 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: Schematic representation of the AiiDA-TrainsPot automated workflow. Initial input structures ($\{\mathcal{X}\}^{(0)}$) can be augmented by creating configurations with random distortions, strain, and vacancies, and by cluster and slab extraction. Ab initio calculations are performed on these configurations ($\{\mathcal{X}_i\}$), which are thus labeled with energies, forces, and stress tensors that constitute the reference data ($\{\mathcal{L}_i\}$). A committee of potentials ($\{\Phi_j\}$) is trained on these configurations, and committee evaluation is used to compare their predictions ($\mathcal{L}_k(\Phi_j)$) on structures ($\mathcal{X}_k$) extracted from MD trajectories at different temperatures and pressures, generated with one of the potentials of the committee ($\Phi_1$). If, for a structure $\mathcal{X}_k$, the disagreement ($\mathcal{D}_k$) averaged over the committee exceeds a threshold, the structure is added to the reference training dataset and ab initio labelling is performed. This iterative process continues until convergence is achieved, at which point the workflow outputs labeled structures, trained potentials, and root mean square errors (RMSE) for energies, forces, and stress tensor components.
  • Figure 2: Schematic representation of the TrainsPotWorkChain with its various computational tasks. The workflow begins with the initialization phase, where input structures, parameters, and control flags are set to determine which steps of the workflow will be executed. If dataset augmentation is enabled, the DatasetAugmentationWorkChain generates additional configurations. Next, the workflow enters the active learning loop. Within each iteration, the AbInitioLabellingWorkChain is called to label newly generated configurations by performing automated electronic structure calculations (PwBaseWorkChain). The TrainingWorkChain is then invoked to train interatomic potentials using MACE (MaceTrainWorkChain) or Metatrain (MetaTrainWorkChain). Subsequently, the ExplorationWorkChain executes MD simulations via LAMMPS (LammpsBaseWorkChain) to generate new configurations for further refinement. The EvaluationCalculation assesses the performance of the trained models using calibrated committee disagreement to determine whether additional iterations are required.
  • Figure 3: Schematic representation of AiiDA-TrainsPot for different use cases. Users can execute individual subtasks or the entire workflow depending on their specific requirements. The workflow can start from Dataset Augmentation to expand data diversity, Ab Initio Labelling to perform DFT-based calculations of energy, force, and stress tensor, or Training to generate or improve existing machine-learned interatomic potentials. Users may also initiate Exploration for MD simulations or Evaluation to assess accuracy and performance.
  • Figure 4: Evolution of model accuracy over active learning. Top panels: RMSE for energies, forces, and stress tensor components are shown across training, validation, and test sets. Bottom panels: Parity plots comparing NNIP predictions (latest iteration of active learning for both runs) to DFT reference values on the final test set. Cold and warm colors identify the results of run A ("fast exploration") and run B ("accuracy and data-efficiency"), respectively. Error bars in the top panels represent the standard deviation across the model committee. Insets in the bottom panels show the error distribution histograms.
  • Figure 5: Exploration of the potential energy surface (PES) and uncertainty quantification (run A). Left panel: t-SNE visualization of SOREP electronic-structure descriptors colored by active learning iteration, showing how the workflow systematically explores the PES in two ways: by improving the sampling of known regions and by simultaneously exploring previously uncharted areas. Right panel: Committee disagreement versus true error (deviation from DFT) across different active learning iterations, showing strong correlation between model uncertainty and actual prediction error. We calibrate committee disagreement with linear regression against true errors, which enables quantitative uncertainty estimation in large-scale applications where reference DFT calculations are not feasible. Insets show the True Positive Rate (TPR) and the Positive Predictive Value (PPV) as functions of disagreement and true error thresholds. Both metrics approach unity along the fitted correlation line (dashed red), i.e., they are simultaneously maximized by a structure selection strategy based on calibrated committee disagreement.
  • ...and 6 more figures
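The TPR and PPV quoted in the Figure 5 caption can be evaluated for any pair of thresholds with a few lines of code. The sketch below assumes that a structure counts as "selected" when its committee disagreement exceeds a disagreement threshold and as a true "positive" when its error with respect to DFT exceeds an error threshold; these conventions and the function name are assumptions made for illustration, not taken from the paper.

```python
import numpy as np


def tpr_ppv(disagreement, true_error, disagreement_threshold, error_threshold):
    """True Positive Rate and Positive Predictive Value of a selection rule.

    Assumed conventions (hypothetical):
      selected = disagreement > disagreement_threshold
      positive = true_error   > error_threshold
    TPR = TP / (TP + FN), PPV = TP / (TP + FP); NaN if a denominator is zero.
    """
    d = np.asarray(disagreement, dtype=float)
    e = np.asarray(true_error, dtype=float)
    selected = d > disagreement_threshold
    positive = e > error_threshold

    tp = np.sum(selected & positive)    # high-error structures that were selected
    fn = np.sum(~selected & positive)   # high-error structures that were missed
    fp = np.sum(selected & ~positive)   # low-error structures labelled needlessly

    tpr = float(tp / (tp + fn)) if (tp + fn) > 0 else float("nan")
    ppv = float(tp / (tp + fp)) if (tp + fp) > 0 else float("nan")
    return tpr, ppv


if __name__ == "__main__":
    # Hypothetical toy values, only to show the call signature.
    d = [0.1, 0.2, 0.3, 0.4]
    e = [0.05, 0.25, 0.20, 0.50]
    print(tpr_ppv(d, e, disagreement_threshold=0.25, error_threshold=0.22))
```

A low TPR means high-error structures escape labelling (false negatives), while a low PPV means DFT effort is spent on structures the model already describes well (false positives), which is why the caption stresses maximizing both at once.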