Interpretable Multimodal Machine Learning Analysis of X-ray Absorption Near-Edge Spectra and Pair Distribution Functions
Tanaporn Na Narong, Zoe N. Zachko, Steven B. Torrisi, Simon J. L. Billinge
TL;DR
This work demonstrates that interpretable random-forest models can fuse XANES and PDF data to characterize local environments around transition-metal cations in oxides. Across four metals (Ti, Mn, Fe, Cu), XANES generally provides stronger, site-specific structural information than total PDFs, while differential PDFs (dPDFs) offer complementary signals that enhance certain predictions. Multimodal models often yield modest improvements over single-modality models, with notable gains when dPDFs are used, especially for bond-length estimation and Fe-related predictions, highlighting the value of species-specific signals. The results provide actionable guidance for experimental design and data fusion strategies, illustrating when combining XANES and PDF modalities adds meaningful information to structural investigations.
Abstract
We used interpretable machine learning to combine information from multiple heterogeneous spectra, namely X-ray absorption near-edge spectra (XANES) and atomic pair distribution functions (PDFs), to extract the local structural and chemical environments of transition-metal cations in oxides. Random forest models were trained on simulated XANES, on simulated PDFs, and on the two combined to predict oxidation state, coordination number, and mean nearest-neighbor bond length. XANES-only models generally outperformed PDF-only models, even for structural tasks, although using the metal's differential PDFs (dPDFs) instead of total PDFs narrowed this gap. When the two modalities were combined, information from XANES often dominated the prediction. Our results demonstrate that XANES contain rich structural information and highlight the utility of species specificity. This interpretable, multimodal approach is quick to implement when suitable databases are available, and it offers insight into the relative strengths of different modalities, guiding researchers in experiment design and in identifying when combining complementary techniques adds meaningful information to a scientific investigation.
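The workflow summarized in the abstract (train on each modality alone, train on both combined, then ask which modality the combined model relies on) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes scikit-learn, uses random synthetic arrays as stand-ins for simulated XANES and PDF features, and picks a hypothetical bond-length target, feature counts, and signal strengths purely for illustration. Grouping feature importances by modality is one simple way to gauge relative contribution, assumed here rather than taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for simulated data (NOT real spectra; all sizes are assumptions).
n_samples, n_xanes, n_pdf = 300, 40, 50
# Hypothetical target: mean nearest-neighbor bond length in angstroms.
bond_length = rng.uniform(1.8, 2.3, n_samples)

# Each "modality" is noise plus a few channels that depend on the target.
# The XANES stand-in is deliberately given the stronger signal.
xanes = rng.normal(size=(n_samples, n_xanes))
xanes[:, :5] += 10.0 * (bond_length[:, None] - 2.05)
pdf = rng.normal(size=(n_samples, n_pdf))
pdf[:, :5] += 3.0 * (bond_length[:, None] - 2.05)


def fit_and_score(features, target, seed=0):
    """Train a random forest and return (model, R^2 on a held-out split)."""
    x_train, x_test, y_train, y_test = train_test_split(
        features, target, test_size=0.25, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(x_train, y_train)
    return model, model.score(x_test, y_test)


# Single-modality models.
_, r2_xanes = fit_and_score(xanes, bond_length)
_, r2_pdf = fit_and_score(pdf, bond_length)

# Multimodal model: concatenate the two feature vectors per sample.
combined = np.hstack([xanes, pdf])
combined_model, r2_combined = fit_and_score(combined, bond_length)

# Sum impurity-based feature importances over each modality's columns
# to see which modality the combined model leans on.
importances = combined_model.feature_importances_
xanes_share = importances[:n_xanes].sum()
pdf_share = importances[n_xanes:].sum()

print(f"R^2  XANES-only: {r2_xanes:.2f}  PDF-only: {r2_pdf:.2f}  combined: {r2_combined:.2f}")
print(f"importance share  XANES: {xanes_share:.2f}  PDF: {pdf_share:.2f}")
```

For classification targets such as oxidation state or coordination number, `RandomForestClassifier` could be substituted under the same assumptions; the per-modality importance sum works the same way.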
