Universal electronic manifolds for extrapolative alloy discovery

Pranoy Ray, Sayan Bhowmik, Phanish Suryanarayana, Surya R. Kalidindi, Andrew J. Medford

Abstract

This study presents a computationally efficient framework for accelerated alloy discovery that uses the non-interacting electron density to capture intrinsic structure-property relationships in refractory high-entropy alloys (HEAs). Unlike state-of-the-art approaches relying on expensive, self-consistent density functional theory calculations, our method employs the non-interacting electron density as the primary structural descriptor. By extracting physical features through directionally resolved two-point spatial correlations and compressing them via Principal Component Analysis, we efficiently map the design space. Coupling these descriptors with Bayesian active learning, we achieve a normalized mean absolute error (NMAE) of <2% for the bulk modulus of Al-Nb-Ti-Zr alloys using only 10 training samples. Furthermore, we demonstrate that the model learns an electronic packing manifold that is transferable across distinct chemical species within refractory HEAs. Validated on a distinct 7-component refractory system (Mo-Nb-Ta-Ti-V-W-Zr) containing four elements entirely absent from the training data, the framework enables rigorous zero-shot extrapolation. Moreover, by augmenting the base model with just 20 samples from the target domain, we achieve high-fidelity predictions (NMAE < 3%) for 7-component alloys, reducing data acquisition costs by orders of magnitude compared to standard workflows. These results establish the non-interacting electron density as a rigorous, extrapolative descriptor for vast compositional landscapes.
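The descriptor pipeline summarized in the abstract (two-point spatial correlations of a density field, followed by PCA compression) can be illustrated with a minimal sketch. This is not the authors' implementation: the random fields below are hypothetical stand-ins for non-interacting electron densities, the function names `two_point_autocorrelation` and `pca_scores` are invented for illustration, and periodic boundary conditions are assumed so that the correlation can be computed with FFTs. The full correlation map is indexed by the shift vector, so it retains directional information.

```python
import numpy as np

def two_point_autocorrelation(field):
    """Periodic two-point autocorrelation of a 3D scalar field via FFT."""
    f = np.asarray(field, dtype=float)
    F = np.fft.fftn(f)
    # Wiener-Khinchin: autocorrelation is the inverse FFT of the power spectrum
    corr = np.fft.ifftn(F * np.conj(F)).real / f.size
    # Move the zero-shift vector to the center of the array
    return np.fft.fftshift(corr)

def pca_scores(X, n_components=3):
    """Project the rows of X onto the leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)  # center each feature
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 12 random "densities" on an 8x8x8 periodic grid
fields = rng.random((12, 8, 8, 8))

# One flattened correlation map per structure, then compress to 3 PC scores
features = np.stack([two_point_autocorrelation(f).ravel() for f in fields])
scores = pca_scores(features, n_components=3)
```

Each row of `scores` is a three-number summary of one structure, which is the kind of low-dimensional input a surrogate model such as a Gaussian process can be trained on.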

Paper Structure (24 sections, 6 equations, 13 figures, 2 tables)

Figures (13)

  • Figure 1: Principal component visualization of the AlNbTiZr composition space ($\mathcal{D}_4$). The 2D projections of the first three PC scores reveal a trapezoidal arrangement where elemental compositions reside at the vertices and the alloys populate the interior, indicating that the pseudo electron density descriptors preserve the chemical and structural hierarchy of the composition space. The 3D projections are located in Supp. Figure \ref{fig:manifold_topology}.
  • Figure 2: Convergence of bulk modulus prediction error (for $\mathcal{D}_4$) with training set size: NMAE versus number of training samples for active and random selection strategies. The active learning strategy using pseudo-density features converges to <2% error within 10 samples, demonstrating that the non-relaxed field contains sufficient information to guide the acquisition function effectively.
  • Figure 3: The end-to-end Bayesian Experiment Design workflow used to train the model on $\mathcal{D}_4$.
  • Figure 4: Parity plots of predicted versus DFT-computed properties for the remaining samples in $\mathcal{D}_4$. (a) Bulk modulus predictions using a model trained on actively selected samples. (b) Alloy formation energy predictions using a distinct GPR model trained on the same 3-PC pseudo-density descriptor inputs.
  • Figure 5: (a) Extrapolative parity plot assessing the transferability of the pseudo-density descriptor. The model was trained exclusively on the 4-component Al-Nb-Ti-Zr system ($\mathcal{D}_4$) and tasked with predicting the bulk moduli of the 7-component system ($\mathcal{D}_7$) containing Mo, Ta, V, and W. (b) The histogram (inset/side) illustrates the uncertainty distribution (predicted standard deviation, $\sigma$) for the test set, indicating that the model remains well-calibrated even in the extrapolation regime.
  • ...and 8 more figures
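
The active-learning loop behind the Figure 2 caption (a Gaussian process queried where its predictive uncertainty is largest, scored by NMAE on the remaining samples) can be sketched as follows. Everything here is an assumption made for illustration rather than the paper's setup: the fixed-hyperparameter RBF kernel, the maximum-variance acquisition rule, the NMAE normalization (MAE divided by the mean absolute true value), the synthetic three-feature data standing in for PC scores, and all function names.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel with unit amplitude."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-6):
    """GP posterior mean and standard deviation (std in kernel units)."""
    y_mean = y_train.mean()
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_test, X_train, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = Ks @ alpha + y_mean
    v = np.linalg.solve(L, Ks.T)
    var = np.clip(1.0 - (v**2).sum(axis=0), 0.0, None)
    return mean, np.sqrt(var)

def nmae(y_true, y_pred):
    """Assumed normalization: MAE divided by the mean absolute true value."""
    return np.mean(np.abs(y_true - y_pred)) / np.mean(np.abs(y_true))

def active_learning(X, y, n_init=2, n_total=10, seed=0):
    """Start from random samples, then add the most uncertain pool point."""
    rng = np.random.default_rng(seed)
    selected = [int(i) for i in rng.choice(len(X), size=n_init, replace=False)]
    while len(selected) < n_total:
        pool = np.setdiff1d(np.arange(len(X)), selected)
        _, std = gp_predict(X[selected], y[selected], X[pool])
        selected.append(int(pool[np.argmax(std)]))
    return selected

rng = np.random.default_rng(1)
# Hypothetical data: 60 samples with 3 features and a smooth synthetic target
X = rng.normal(size=(60, 3))
y = 100.0 + 10.0 * np.sin(X[:, 0]) + 5.0 * X[:, 1]

selected = active_learning(X, y, n_total=10)
rest = np.setdiff1d(np.arange(len(X)), selected)
pred, sigma = gp_predict(X[selected], y[selected], X[rest])
error = nmae(y[rest], pred)
```

Replacing the `argmax` step with a random draw from `pool` gives the random-selection baseline that the caption compares against. The error obtained on this synthetic data says nothing about the values reported in the paper.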