Electronic structure prediction of medium and high entropy alloys across composition space
Shashank Pathrudkar, Stephanie Taylor, Abhishek Keripale, Abhijeet Sadashiv Gangan, Ponkrshnan Thiagarajan, Shivang Agarwal, Jaime Marian, Susanta Ghosh, Amartya S. Banerjee
TL;DR
This work tackles the expensive evaluation of electronic structure across composition space in medium and high entropy alloys by proposing a data-efficient ML workflow that predicts the ground-state electron density $\rho$ and derived energies. It combines Bayesian Active Learning with novel body-attached-frame descriptors and a separate $\delta\rho$ model to improve accuracy, enabling reliable predictions across binary, ternary, and quaternary alloys (and even quinary extensions). The approach demonstrates strong generalization to unseen compositions, defects, and segregation patterns, while reducing training data needs by factors up to 2.5 (ternary) and 1.7 (quaternary) compared to tessellation-based sampling, achieving chemical-accuracy-level energies in many cases. This framework substantially accelerates exploration of composition space for HEAs/MEAs and can be extended to other bulk materials and low-dimensional systems, providing a practical route to rapid materials discovery from first principles.
Abstract
We propose machine learning (ML) models to predict the electron density, the fundamental unknown of a material's ground state, across the composition space of concentrated alloys. From the electron density, other physical properties can be inferred, enabling accelerated exploration. A significant challenge is that the number of sampled compositions and descriptors required to accurately predict fields like the electron density increases rapidly with the number of species. To address this, we employ Bayesian Active Learning (AL), which minimizes training data requirements by leveraging the uncertainty quantification capabilities of Bayesian Neural Networks. Compared to strategic tessellation of the composition space, Bayesian-AL reduces the number of training data points by a factor of 2.5 for ternary (SiGeSn) and 1.7 for quaternary (CrFeCoNi) systems. We also introduce easy-to-optimize, body-attached-frame descriptors, which respect physical symmetries and maintain approximately the same descriptor-vector size as the number of alloying elements increases. Our ML models demonstrate high accuracy and generalizability in predicting both electron density and energy across composition space.
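
To make the idea of uncertainty-driven sampling concrete, the following is a minimal sketch of an active learning loop over a ternary composition grid. It is not the authors' implementation. As assumptions for illustration only: a bootstrap ensemble of ridge regressions stands in for the Bayesian Neural Network, a scalar toy function (`toy_property`) stands in for the expensive first-principles calculation, and the quadratic `features` are not the body-attached-frame descriptors from the paper. All function names are hypothetical.

```python
import numpy as np

def simplex_grid(n_species, steps):
    """All compositions whose fractions are multiples of 1/steps and sum to 1."""
    def rec(remaining, k):
        if k == 1:
            yield (remaining,)
            return
        for i in range(remaining + 1):
            for tail in rec(remaining - i, k - 1):
                yield (i,) + tail
    return np.array(list(rec(steps, n_species)), dtype=float) / steps

def features(x):
    """Linear plus pairwise-product features of the fractions (illustrative)."""
    n = x.shape[1]
    pairs = [x[:, i] * x[:, j] for i in range(n) for j in range(i + 1, n)]
    return np.column_stack([x] + pairs)

def toy_property(x):
    """Hypothetical stand-in for an expensive first-principles calculation."""
    return np.sin(3.0 * x[:, 0]) + x[:, 1] * x[:, 2] + 0.5 * x[:, 0] * x[:, 1]

def ensemble_predict(x_train, y_train, x_query, n_models, rng):
    """Bootstrap ensemble of ridge fits; the spread is a crude uncertainty proxy
    (assumption: this replaces the Bayesian Neural Network used in the paper)."""
    phi_q = features(x_query)
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(x_train), len(x_train))  # resample with replacement
        phi = features(x_train[idx])
        a = phi.T @ phi + 1e-3 * np.eye(phi.shape[1])       # ridge-regularized normal matrix
        w = np.linalg.solve(a, phi.T @ y_train[idx])
        preds.append(phi_q @ w)
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

def active_learning(pool, n_rounds, seed=0):
    """Repeatedly label the pool composition with the largest predictive spread."""
    rng = np.random.default_rng(seed)
    # Start from the pure elements, i.e. the vertices of the composition simplex.
    labeled = [int(np.argmax(pool[:, k])) for k in range(pool.shape[1])]
    for _ in range(n_rounds):
        x_tr = pool[labeled]
        y_tr = toy_property(x_tr)
        _, std = ensemble_predict(x_tr, y_tr, pool, 20, rng)
        std[labeled] = -np.inf          # never re-select an already labeled point
        labeled.append(int(np.argmax(std)))
    return labeled

if __name__ == "__main__":
    pool = simplex_grid(3, 10)          # ternary grid with 10% composition steps
    chosen = active_learning(pool, n_rounds=5)
    print(pool[chosen])
```

Running the script prints the compositions selected for labeling, starting with the three pure elements. A tessellation-style baseline would instead label a fixed subset of `pool` chosen in advance; the active loop spends each new calculation where the current model is least certain.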
