Accelerating superconductor discovery through tempered deep learning of the electron-phonon spectral function

Jason B. Gibson, Ajinkya C. Hire, Philip M. Dee, Oscar Barrera, Benjamin Geisler, Peter J. Hirschfeld, Richard G. Hennig

TL;DR

This work tackles the data-limited challenge of discovering new electron-phonon superconductors by predicting the electron-phonon spectral function $α^2F(ω)$ with a physics-informed deep learning approach. The authors introduce BETE-NET, a Bootstrapped Ensemble of Tempered Equivariant graph neural networks, which can take the site-projected phonon density of states (PhDOS) as an inductive bias and leverages the double descent phenomenon to temper overfitting. Trained on a small dataset of 818 materials, it predicts the Eliashberg moments $λ$, $ω_{\log}$, and $ω_{2}$ as well as the critical temperature $T_c$. With the PhDOS bias, the model achieves MAEs of 0.18 for $λ$, 29 K for $ω_{\log}$, and 28 K for $ω_{2}$, and a $T_c$ MAE of 2.1 K. Its three variants, crystal-structure-only (CSO), coarse PhDOS (CPD), and fine PhDOS (FPD), are data-efficient and suited to high-throughput screening, with the best variants reaching an average precision (AP) nearly five times that of random screening. A practical screening strategy shows how surrogate models can triage candidates for expensive DFT calculations, identifying both high-$T_c$ predictions and experimentally promising materials, thereby accelerating superconductor discovery. The approach highlights how physics-informed ML can enable robust predictions from limited data and sets a precedent for applying such methods to other derived properties in materials science.

Abstract

Integrating deep learning with the search for new electron-phonon superconductors represents a burgeoning field of research, where the primary challenge lies in the computational intensity of calculating the electron-phonon spectral function, $α^2F(ω)$, the essential ingredient of the Migdal-Eliashberg theory of superconductivity. To overcome this challenge, we adopt a two-step approach. First, we compute $α^2F(ω)$ for 818 dynamically stable materials. We then train a deep-learning model to predict $α^2F(ω)$, using an unconventional training strategy to temper the model's overfitting, enhancing predictions. Specifically, we train a Bootstrapped Ensemble of Tempered Equivariant graph neural NETworks (BETE-NET), obtaining MAEs of 0.21, 45 K, and 43 K for the Eliashberg moments derived from $α^2F(ω)$: $λ$, $ω_{\log}$, and $ω_{2}$, respectively, yielding an MAE of 2.5 K for the critical temperature, $T_c$. Further, we incorporate domain knowledge of the site-projected phonon density of states to impose inductive bias into the model's node attributes and enhance predictions. This methodological innovation decreases the MAEs to 0.18, 29 K, and 28 K, respectively, yielding an MAE of 2.1 K for $T_c$. We illustrate the practical application of our model in high-throughput screening for high-$T_c$ materials. The model demonstrates an average precision nearly five times higher than random screening, highlighting the potential of ML in accelerating superconductor discovery. BETE-NET accelerates the search for high-$T_c$ superconductors while setting a precedent for applying ML in materials discovery, particularly when data is limited.
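
To make the link between $α^2F(ω)$ and the reported quantities concrete, the following is a minimal Python sketch, not the authors' code. It uses the standard definitions of the Eliashberg moments, $λ = 2\int α^2F(ω)/ω\,dω$, $ω_{\log} = \exp[(2/λ)\int \ln(ω)\,α^2F(ω)/ω\,dω]$, and $ω_{2} = [(2/λ)\int ω\,α^2F(ω)\,dω]^{1/2}$. The abstract does not say which formula converts the moments into $T_c$; the sketch assumes the simplified Allen-Dynes expression with a Coulomb pseudopotential $μ^* = 0.1$. The function names and the single-peak toy spectrum are illustrative assumptions.

```python
import numpy as np


def _trapezoid(y, x):
    """Trapezoidal integral of y(x) on a possibly non-uniform grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def eliashberg_moments(omega, a2f):
    """Return (lambda, omega_log, omega_2) for a sampled alpha^2F(omega).

    omega is the frequency grid (here in kelvin); omega_log and omega_2
    come out in the same unit. Points with omega <= 0 are dropped because
    the integrands contain 1/omega and ln(omega).
    """
    omega = np.asarray(omega, dtype=float)
    a2f = np.asarray(a2f, dtype=float)
    keep = omega > 0
    w, a = omega[keep], a2f[keep]
    lam = 2.0 * _trapezoid(a / w, w)
    omega_log = float(np.exp(2.0 / lam * _trapezoid(a * np.log(w) / w, w)))
    omega_2 = float(np.sqrt(2.0 / lam * _trapezoid(a * w, w)))
    return lam, omega_log, omega_2


def allen_dynes_tc(lam, omega_log, mu_star=0.1):
    """Simplified Allen-Dynes T_c (same unit as omega_log).

    Assumption: mu_star = 0.1 and no strong-coupling correction factors;
    the source does not state which T_c formula it uses.
    """
    denom = lam - mu_star * (1.0 + 0.62 * lam)
    if denom <= 0.0:
        return 0.0  # the formula predicts no superconductivity here
    return omega_log / 1.2 * float(np.exp(-1.04 * (1.0 + lam) / denom))


# Toy spectrum (illustrative only): one narrow Gaussian phonon peak at 300 K,
# scaled so that lambda is close to 1.
omega = np.linspace(1.0, 600.0, 6000)
peak = np.exp(-0.5 * ((omega - 300.0) / 2.0) ** 2) / (2.0 * np.sqrt(2.0 * np.pi))
a2f = 0.5 * 1.0 * 300.0 * peak

lam, omega_log, omega_2 = eliashberg_moments(omega, a2f)
tc = allen_dynes_tc(lam, omega_log)
print(f"lambda={lam:.2f}  omega_log={omega_log:.0f} K  omega_2={omega_2:.0f} K  Tc={tc:.1f} K")
```

For this toy spectrum both frequency moments sit at the peak position, and the assumed formula gives a $T_c$ of roughly 21 K. Because $T_c$ depends exponentially on $λ$, small errors in the predicted moments can translate into noticeably larger relative errors in $T_c$.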

Paper Structure (13 sections, 5 figures, 1 table)

Figures (5)

  • Figure 1: BETE-NET architecture for predicting the Eliashberg function. First, the model converts a crystal structure into a graph with nodes representing atoms. Each node is embedded with a one-hot encoding of the atom's atomic number, multiplied by its atomic mass. The edges, which connect atoms within a 4 Å radius, embed the interatomic distance. A sequence of convolutions and gated-block operations is then applied to the graph. Finally, a pooling operation is applied to the node embeddings, yielding the prediction of $\alpha^2F(\omega)$. This is the base network for the crystal-structure-only (CSO) variant of BETE-NET. A bootstrapped ensemble of 100 of these models is then trained to produce the final model. The model's prediction is vastly improved by including information on the site-projected PhDOS. To enable this, we add a decision block that determines whether the site-projected PhDOS is appended to the initial node embedding. There are two possible embeddings. The coarse PhDOS (CPD) variant embeds a coarse site-projected PhDOS (force constants calculated on a $2\times2\times2$ $\mathbf{q}$-grid and Fourier interpolated to $20\times20\times20$) into the nodes. The fine PhDOS (FPD) variant is the same as CPD, but its PhDOS embedding is calculated on a fine $\mathbf{q}$-grid and Fourier interpolated to $20\times20\times20$.
  • Figure 2: Correlation and histogram plots of the dataset. (a) Electron-phonon coupling constant, $\lambda$, and electronic density of states (eDOS) at the Fermi energy; (b) $\alpha^2 F$- and PhDOS-derived $\omega_{\text{log}}$; and (c) $\alpha^2 F$- and PhDOS-derived $\omega_2$. The histograms on the right-hand side show the distribution in our dataset of the electron-phonon coupling constant and of the two moments of $\frac{2}{\lambda\omega}\alpha^2F(\omega)$: the logarithmic moment $\omega_{\text{log}}$ and the second moment $\omega_2$. The electronic DOS is averaged over a window of $\pm$50 meV around the Fermi energy. These correlation plots guide the fit of a baseline model against which the deep learning models are compared.
  • Figure 3: Comparison of the classical, critical, and modern training regimes for a select model. (a) The smoothed learning curves for the training loss (blue) and validation loss (orange) as a function of training epochs. The thickness of each line reflects small fluctuations in the loss. The insets show the loss landscape for the classical (first minimum), critical (maximum), and modern (second minimum) regimes, illustrating why generalization improves in the second descent. The x- and y-axes of the loss landscape are the magnitudes of perturbation along two orthogonal directions, and the z-axis is the magnitude of the loss. (b, c) Validation parity plots of $\lambda$, $\omega_{\log}$, and $\omega_{2}$, respectively, for the model trained in the classical regime. (d, e) Validation parity plots of $\lambda$, $\omega_{\log}$, and $\omega_{2}$, respectively, for the model trained in the critical regime. (f, g) Validation parity plots of $\lambda$, $\omega_{\log}$, and $\omega_{2}$, respectively, for the model trained in the modern regime.
  • Figure 4: Test results for the three final variants of BETE-NET. (a-i) Test parity plots of $\lambda$, $\omega_{\log}$, and $\omega_{2}$, respectively, for each model. Comparing the parity plots across models shows a systematic improvement in the derived properties, highlighting the advantage of embedding physically relevant properties. (j) The six best-predicted $\alpha^2F(\omega)$, where embedding the PhDOS almost perfectly corrects the predictions of the CSO model. (k) The six worst-predicted $\alpha^2F(\omega)$. (l) The average prediction error, $\bar{\Delta}$, of materials containing each element.
  • Figure 5: Results of screening the test set for the 33 superconductors with $T_{\mathrm{c}}^{\mathrm{DFT}}>5$ K. The precision-recall curve is shown for each model. Precision is the fraction of materials predicted by the model to have $T_{\mathrm{c}}^{\mathrm{DFT}}>5$ K that truly meet this criterion. Recall is the fraction of all materials with $T_{\mathrm{c}}^{\mathrm{DFT}}>5$ K that the model correctly identifies. The color of the line represents the $T_{\mathrm{c}}^{\mathrm{ML}}$ criterion. The title signifies the model, and the number in parentheses is the average precision (AP). The solid black marker signifies our suggested criterion for using the models as surrogates in high-throughput screening; the inset shows the confusion matrix for this criterion. The x-axis of the confusion matrix is the predicted label, the y-axis is the true label, and the numbers in parentheses show a comparison with a random classifier. A random classifier would obtain an AP of 0.2 on our test set; all our models perform better, with our best models (CPD and FPD) obtaining an AP nearly five times that of a random classifier. Our models also perform better than the baseline model, which requires information on the PhDOS and eDOS.
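
The graph construction described in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the BETE-NET implementation: the function names, the one-hot length of 118, the brute-force search over periodic images, and the toy body-centered cubic cell (lattice constant of about 3.30 Å, roughly that of Nb) are all choices made here for the example. Only the 4 Å cutoff, the distance-valued edges, and the mass-scaled one-hot node encoding come from the caption.

```python
import itertools

import numpy as np

MAX_Z = 118  # assumed length of the one-hot vector (one slot per element)


def node_features(atomic_numbers, masses):
    """One-hot encoding of the atomic number, scaled by the atomic mass."""
    feats = np.zeros((len(atomic_numbers), MAX_Z))
    for i, (z, mass) in enumerate(zip(atomic_numbers, masses)):
        feats[i, z - 1] = mass
    return feats


def radius_graph(lattice, frac_coords, cutoff=4.0):
    """Directed edges (i -> j) between atoms closer than `cutoff` angstrom.

    Periodic images are included. `lattice` holds the cell vectors as rows
    and `frac_coords` are fractional coordinates assumed to lie in [0, 1).
    Returns (source indices, target indices, distances).
    """
    lattice = np.asarray(lattice, dtype=float)
    cart = np.asarray(frac_coords, dtype=float) @ lattice
    n_atoms = len(cart)

    # The spacing between lattice planes along each cell vector decides how
    # many neighboring cells can contain atoms within the cutoff.
    volume = abs(np.linalg.det(lattice))
    normals = np.cross(lattice[[1, 2, 0]], lattice[[2, 0, 1]])
    heights = volume / np.linalg.norm(normals, axis=1)
    reps = np.ceil(cutoff / heights).astype(int)

    src, dst, dist = [], [], []
    for shift in itertools.product(*(range(-r, r + 1) for r in reps)):
        offset = np.asarray(shift) @ lattice
        # delta[i, j] = position of atom j in the shifted cell minus atom i
        delta = cart[None, :, :] + offset - cart[:, None, :]
        d = np.linalg.norm(delta, axis=-1)
        mask = d <= cutoff
        if shift == (0, 0, 0):
            mask &= ~np.eye(n_atoms, dtype=bool)  # an atom is not its own neighbor
        i_idx, j_idx = np.nonzero(mask)
        src.extend(i_idx.tolist())
        dst.extend(j_idx.tolist())
        dist.extend(d[mask].tolist())
    return np.array(src), np.array(dst), np.array(dist)


# Toy example (illustrative values): conventional bcc cell with two Nb atoms.
a = 3.30
lattice = a * np.eye(3)
frac = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
x = node_features([41, 41], [92.906, 92.906])
src, dst, dist = radius_graph(lattice, frac)
print(x.shape, len(dist), round(float(dist.min()), 3))
```

In the toy cell each atom has 8 nearest and 6 second-nearest neighbors inside the cutoff, so the graph has 28 directed edges. For the CPD and FPD variants, the caption indicates that the site-projected PhDOS is appended to these initial node features; how it is discretized is not specified in this section.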
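
The three training regimes named in the Figure 3 caption correspond to three points on the validation curve: the first minimum (classical), the following maximum (critical), and the second minimum (modern). The sketch below shows one way such points could be located from a logged validation loss. It is a hedged illustration: the source does not describe how checkpoints were selected, and the function name, the moving-average window, and the synthetic loss curve are assumptions made for the example.

```python
import numpy as np


def double_descent_regimes(val_loss, window=5):
    """Return epochs (classical, critical, modern) for a validation curve.

    classical: first local minimum of the smoothed curve
    critical:  highest point after that minimum
    modern:    lowest point after the critical maximum
    `window` is the moving-average width and is assumed to be odd.
    """
    val_loss = np.asarray(val_loss, dtype=float)
    # Moving average; "valid" mode shortens the curve by window - 1 points.
    smoothed = np.convolve(val_loss, np.ones(window) / window, mode="valid")

    classical = None
    for i in range(1, len(smoothed) - 1):
        if smoothed[i] <= smoothed[i - 1] and smoothed[i] < smoothed[i + 1]:
            classical = i
            break
    if classical is None:
        raise ValueError("no interior local minimum: the curve never turns upward")

    critical = classical + int(np.argmax(smoothed[classical:]))
    modern = critical + int(np.argmin(smoothed[critical:]))

    shift = window // 2  # map smoothed indices back to the center of each window
    return classical + shift, critical + shift, modern + shift


# Synthetic curve (illustrative only): fast initial descent, an overfitting
# bump near epoch 120, then a slow second descent to a lower loss.
epochs = np.arange(300)
val_loss = (
    np.exp(-epochs / 15.0)
    + (0.5 - 0.001 * epochs)
    + 0.5 * np.exp(-(((epochs - 120.0) / 25.0) ** 2))
)
classical, critical, modern = double_descent_regimes(val_loss)
print(classical, critical, modern)
```

On noisy curves the first local minimum can be spurious, so a wider smoothing window or a minimum-depth criterion may be needed. If the loss never descends a second time, the returned modern epoch simply coincides with the critical one.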
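
The precision, recall, and average precision (AP) defined in the Figure 5 caption can be computed as in the following Python sketch. It assumes the common step-wise definition of AP, in which precision is weighted by the increase in recall as the $T_{\mathrm{c}}^{\mathrm{ML}}$ criterion is lowered; the source does not state which variant it uses. The function names and the six toy $T_c$ values are made up for illustration and are not data from the paper.

```python
import numpy as np


def precision_recall(tc_dft, tc_ml, dft_cut=5.0):
    """Precision and recall as the T_c^ML criterion is lowered step by step.

    A material counts as a true superconductor if T_c^DFT > dft_cut.
    Entry k of each output corresponds to flagging the k + 1 materials with
    the highest T_c^ML. Ties in T_c^ML are not treated specially, and at
    least one true superconductor is assumed to be present.
    """
    tc_dft = np.asarray(tc_dft, dtype=float)
    tc_ml = np.asarray(tc_ml, dtype=float)
    is_positive = tc_dft > dft_cut
    order = np.argsort(-tc_ml, kind="stable")  # highest prediction first
    true_positives = np.cumsum(is_positive[order])
    n_flagged = np.arange(1, len(tc_ml) + 1)
    precision = true_positives / n_flagged
    recall = true_positives / is_positive.sum()
    return precision, recall, tc_ml[order]


def average_precision(precision, recall):
    """Sum of precision weighted by the increase in recall at each step."""
    recall_step = np.diff(recall, prepend=0.0)
    return float(np.sum(recall_step * precision))


# Toy data (illustrative only), in kelvin.
tc_dft = [12.0, 0.5, 7.0, 1.0, 0.2, 6.0]
tc_ml = [10.0, 6.0, 8.0, 0.3, 0.1, 2.0]

precision, recall, thresholds = precision_recall(tc_dft, tc_ml)
ap = average_precision(precision, recall)
# A random ranking is expected to score an AP close to the positive fraction.
random_ap = float(np.mean(np.asarray(tc_dft) > 5.0))
print(f"AP = {ap:.3f}, random baseline = {random_ap:.3f}")
```

Each entry of `thresholds` is a candidate $T_{\mathrm{c}}^{\mathrm{ML}}$ criterion, so a screening cutoff can be chosen by reading off the precision and recall at the desired trade-off. The caption's random-classifier AP of 0.2 plays the role of `random_ap` here.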