Machine Learning for Improved Density Functional Theory Thermodynamics
Sergei I. Simak, Erna K. Delczeg-Czirjak, Olle Eriksson
TL;DR
This work tackles the limited energy resolution of density functional theory (DFT) in predicting alloy phase stability by introducing Error Corrected DFT (EC-DFT), a machine-learning framework that learns the correction $H_{\text{corr}} = H_{\text{DFT}} - H_{\text{expt}}$ to formation enthalpies. A structured feature set (concentrations, weighted atomic numbers, and interaction terms) feeds both a linear model and a multi-layer perceptron (MLP) regressor, with the latter delivering substantially lower RMSEs via cross-validation on binary and ternary systems. Applied to the Al-Ni-Pd and Al-Ni-Ti ternaries, the neural network achieves RMSEs as low as ~5.5 meV/atom overall and ~10 meV/atom on unseen compositions, significantly improving DFT-based formation enthalpy predictions. The EC-DFT framework provides a scalable, interpretable route to more reliable phase-diagram predictions, aiding accelerated computational materials design while highlighting potential experimental or model limitations in specific cases.
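To make the feature set and the correction concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "weighted atomic numbers" means concentration-weighted atomic numbers $x_i Z_i$ and that "interaction terms" means pairwise products $x_i x_j$. The exact feature definitions are not given in this section. The function names `build_features` and `corrected_enthalpy` are hypothetical.

```python
from itertools import combinations

# Atomic numbers of the elements in the two ternaries named in the text.
ATOMIC_NUMBER = {"Al": 13, "Ni": 28, "Pd": 46, "Ti": 22}


def build_features(composition, elements):
    """Return [x_i..., x_i * Z_i..., x_i * x_j...] for a fixed element order.

    Assumed feature layout (hypothetical): concentrations, concentration-
    weighted atomic numbers, and pairwise concentration products.
    """
    total = sum(composition.values())
    # Normalize the composition to atomic fractions in the given order.
    x = [composition.get(el, 0.0) / total for el in elements]
    weighted_z = [xi * ATOMIC_NUMBER[el] for xi, el in zip(x, elements)]
    pairwise = [x[i] * x[j] for i, j in combinations(range(len(elements)), 2)]
    return x + weighted_z + pairwise


def corrected_enthalpy(h_dft, h_corr_predicted):
    """Since H_corr = H_DFT - H_expt, the corrected estimate is H_DFT - H_corr."""
    return h_dft - h_corr_predicted


# Equiatomic AlNi expressed in the Al-Ni-Pd element ordering.
features = build_features({"Al": 1, "Ni": 1}, ["Al", "Ni", "Pd"])
```

In this sketch a trained regressor would map `features` to a predicted $H_{\text{corr}}$, which `corrected_enthalpy` then subtracts from the DFT value.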
Abstract
The predictive accuracy of density functional theory (DFT) for alloy formation enthalpies is often limited by intrinsic energy resolution errors, particularly in ternary phase stability calculations. In this work, we present a machine learning (ML) approach to systematically correct these errors, improving the reliability of first-principles predictions. A neural network model is trained to predict the discrepancy between DFT-calculated and experimentally measured enthalpies for binary and ternary alloys and compounds. The model uses a structured feature set comprising elemental concentrations, atomic numbers, and interaction terms to capture key chemical and structural effects. By applying supervised learning and rigorous data curation, we ensure a robust and physically meaningful correction. The model is implemented as a multi-layer perceptron (MLP) regressor with three hidden layers, optimized through leave-one-out cross-validation (LOOCV) and k-fold cross-validation to prevent overfitting. We illustrate the effectiveness of this method by applying it to the Al-Ni-Pd and Al-Ni-Ti systems, which are of interest for high-temperature applications in aerospace and protective coatings.
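The leave-one-out procedure mentioned above can be sketched generically as follows. This is an illustration under stated assumptions, not the authors' code. The helper `loocv_rmse` and the placeholder mean-predictor are hypothetical, and the toy numbers are made up for the example. In practice the `fit` and `predict` callables would wrap the three-hidden-layer MLP regressor.

```python
import math


def loocv_rmse(X, y, fit, predict):
    """Hold out each sample once, train on the rest, and pool squared errors."""
    squared_errors = []
    for i in range(len(X)):
        # Training set is every sample except the i-th one.
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        squared_errors.append((predict(model, X[i]) - y[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Placeholder model (an assumption for illustration): always predict the
# mean of the training targets. An MLP regressor would be plugged in here.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x):
    return model


# Made-up toy data, not values from the paper.
X_toy = [[0.5, 0.5], [0.25, 0.75], [0.75, 0.25]]
y_toy = [1.0, 2.0, 3.0]
rmse = loocv_rmse(X_toy, y_toy, fit_mean, predict_mean)
```

Because every sample is scored only by a model that never saw it, the pooled RMSE is an estimate of error on unseen compositions, which is why the procedure helps detect overfitting on small datasets.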
