OmniXAS: A Universal Deep-Learning Framework for Materials X-ray Absorption Spectra

Shubha R. Kharel; Fanchen Meng; Xiaohui Qu; Matthew R. Carbone; Deyu Lu

TL;DR

OmniXAS presents a universal deep-learning framework for predicting K-edge XANES spectra across eight 3d transition metals by combining M3GNet-derived transfer-features with a cascaded transfer-learning scheme. The approach trains a universal XAS model on diverse elements and fine-tunes it on element-specific data, with additional cross-fidelity transfer from fast FEFF simulations to more expensive VASP calculations. Key contributions include transfer-feature representations that outperform conventional featurization, a universal-to-element-specific transfer mechanism, and orders-of-magnitude throughput gains over first-principles simulations, enabling high-throughput and real-time XAS analysis. The framework is positioned to generalize beyond XAS to other material-property predictions, providing a scalable blueprint for transfer learning in materials science.

Abstract

X-ray absorption spectroscopy (XAS) is a powerful characterization technique for probing the local chemical environment of absorbing atoms. However, analyzing XAS data presents significant challenges, often requiring extensive, computationally intensive simulations, as well as significant domain expertise. These limitations hinder the development of fast, robust XAS analysis pipelines that are essential in high-throughput studies and for autonomous experimentation. We address these challenges with OmniXAS, a framework that contains a suite of transfer learning approaches for XAS prediction, each contributing to improved accuracy and efficiency, as demonstrated on a K-edge spectra database covering eight 3d transition metals (Ti-Cu). The OmniXAS framework is built upon three distinct strategies. First, we use M3GNet to derive latent representations of the local chemical environment of absorption sites as input for XAS prediction, achieving up to order-of-magnitude improvements over conventional featurization techniques. Second, we employ a hierarchical transfer learning strategy, training a universal multi-task model across elements before fine-tuning for element-specific predictions. After element-wise fine-tuning, models based on this cascaded approach outperform element-specific models by up to 69%. Third, we implement cross-fidelity transfer learning, adapting a universal model to predict spectra generated by simulations of a different fidelity with a higher computational cost. This approach improves prediction accuracy by up to 11% over models trained on the target fidelity alone. Our approach boosts the throughput of XAS modeling by orders of magnitude versus first-principles simulations and can be extended to XAS prediction for a broader range of elements. This transfer learning framework is generalizable to enhance deep-learning models that target other properties in materials research.
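The second strategy in the abstract (train one universal model across elements, then fine-tune it per element) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the data are synthetic, random vectors stand in for frozen M3GNet latent features, a linear map trained by gradient descent stands in for the trainable XAS network, and the names `make_element_data`, `fit`, `N_FEATURES`, and `N_GRID` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 8  # hypothetical size of the frozen latent vector per site
N_GRID = 5      # hypothetical number of energy-grid points per spectrum

# Structure shared by all elements; each element adds its own perturbation.
SHARED_MAP = rng.normal(size=(N_FEATURES, N_GRID))


def make_element_data(n_sites, shift):
    """Synthetic (features, spectra) pairs for one absorbing element."""
    features = rng.normal(size=(n_sites, N_FEATURES))
    true_map = SHARED_MAP + shift * rng.normal(size=(N_FEATURES, N_GRID))
    noise = 0.01 * rng.normal(size=(n_sites, N_GRID))
    return features, features @ true_map + noise


def mse(weights, features, spectra):
    """Mean squared error over all sites and grid points."""
    return float(np.mean((features @ weights - spectra) ** 2))


def fit(weights, features, spectra, lr=0.5, steps=200):
    """Plain gradient descent on the MSE; returns an updated copy."""
    w = weights.copy()
    for _ in range(steps):
        residual = features @ w - spectra
        w -= lr * 2.0 * features.T @ residual / residual.size
    return w


# Element labels are used only as dictionary keys for the synthetic subsets.
elements = {name: make_element_data(200, shift=0.3) for name in ("Ti", "Fe", "Cu")}

# Stage 1: one universal model trained on the pooled data of all elements.
pooled_x = np.vstack([x for x, _ in elements.values()])
pooled_y = np.vstack([y for _, y in elements.values()])
universal = fit(np.zeros((N_FEATURES, N_GRID)), pooled_x, pooled_y)

# Stage 2: fine-tune a copy of the universal weights on each element subset.
tuned = {name: fit(universal, x, y, steps=50) for name, (x, y) in elements.items()}

for name, (x, y) in elements.items():
    print(name, mse(universal, x, y), mse(tuned[name], x, y))
```

The sketch only shows the order of operations: the feature extractor is never updated, the universal weights serve as the starting point for every element, and each fine-tuned copy is evaluated on its own subset. It reports training error on synthetic data, so it says nothing about the accuracy gains quoted in the abstract.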

Paper Structure (43 sections, 22 equations, 14 figures, 1 table)

Introduction
Method
Data Acquisition and Curation
Input File Generation
Removal of Unphysical Spectra and Outliers
Rescaling and Edge Alignment
XAS Prediction Hypotheses
Transfer Learning
Inductive Transfer Learning via Feature-Transfer
Domain Adaptation via Fine-tuning
XASModels
M3GNet block
Transfer-Features
XAS block
XASModel Training
...and 28 more sections

Figures (14)

Figure 1: Schematic of the OmniXAS framework. a) Data Curation: Structural data are sourced from the Materials Project, from which FEFF and VASP spectral input files are generated locally using Lightshow. FEFF and VASP spectral simulations are performed using these input files, and the results are screened and processed into machine learning-ready data. b) XAS Model: Materials structure information is processed through frozen M3GNet blocks using a series of graph convolutions with three-body interaction updates. From these operations, only the latent state at the node of the absorption site is passed into trainable neural networks for predicting site-specific XAS spectra. c) Cascaded Transfer Learning: The workflow diverges into two paths for training the XAS-block, resulting in three variants of XAS models. In one path, individual models for each data subset (ExpertXAS) are trained. In the other, a single model that predicts FEFF spectra for all elements (UniversalXAS) is developed. Further along, knowledge transfer from the UniversalXAS model is applied through fine-tuning, producing another set of specialized models for each subset that we call Tuned-UniversalXAS.
Figure 2: Number of spectra generated for each element from FEFF and VASP simulations. The highest bars represent the total number of spectra, with partitions indicating the portions removed during different data cleaning stages (unconverged and anomalies) and the remaining ML-ready data.
Figure 3: Heatmaps of FEFF XANES spectra. Color represents the density of spectral features.
Figure 4: Heatmaps of VASP XANES spectra. Color scheme is the same as in Fig. 3.
Figure 5: UMAP plots of M3GNet featurization (a-d) and spectra (e-h) colored by absorbing element type (a and e), oxidation state (OS) (b and f), coordination number (CN) (c and g), and oxygen coordination number (OCN) (d and h). The oxygen coordination number is rounded to the nearest half integer for visualization purposes. Gray points are those for which the physical descriptor cannot be determined.
...and 9 more figures
