Determining Atomic Structure from Spectroscopy via an Active Learning Framework

Ian Slagle, Faisal Alamgir, Victor Fung

Abstract

Determining atomic structure from spectroscopic data is central to materials science but remains restricted to a limited set of techniques and material classes, largely because of the computational cost and complexity of structural refinement. Here we introduce ActiveStructOpt, a general framework that integrates graph neural network surrogate models with active learning to efficiently determine candidate structures that reproduce target spectra with minimal computational expenditure. Benchmarks with X-ray pair distribution function data, and with the more computationally demanding simulations of X-ray absorption near-edge spectra (XANES) and extended X-ray absorption fine structure (EXAFS), demonstrate that ActiveStructOpt reliably determines structures whose spectra closely match the targets across diverse materials classes. Under equivalent computational budgets, ActiveStructOpt outperforms existing structure determination methods. By enabling data-efficient, multi-objective structural refinement across a broad range of computable spectroscopic techniques, ActiveStructOpt provides a flexible and extensible approach to atomic structure determination in complex materials.
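
To make the idea of surrogate-assisted refinement concrete, the following is a minimal toy sketch and not the ActiveStructOpt implementation. It assumes a one-dimensional "structure" parameter `x`, a hypothetical `simulate_spectrum` function standing in for an expensive spectrum simulation, and a quadratic polynomial fit standing in for the graph neural network surrogate. The greedy choice of the surrogate minimum is also a simplification; all names are illustrative.

```python
import numpy as np

R = np.linspace(0.0, 5.0, 50)  # grid on which the toy "spectrum" is sampled

def simulate_spectrum(x):
    """Hypothetical stand-in for an expensive simulation: one peak centred at x."""
    return np.exp(-(R - x) ** 2)

def mse(a, b):
    """Mean squared error between two spectra."""
    return float(np.mean((a - b) ** 2))

def refine(target, x_init, n_steps=5):
    """Greedy surrogate-assisted search for the x whose spectrum matches target."""
    xs = [float(x) for x in x_init]
    errs = [mse(simulate_spectrum(x), target) for x in xs]
    grid = np.linspace(0.0, 5.0, 501)  # candidate values of the structure parameter
    for _ in range(n_steps):
        # Cheap surrogate: quadratic fit of error against x (assumed; replaces the GNN).
        coeffs = np.polyfit(xs, errs, deg=2)
        pred = np.polyval(coeffs, grid)
        pred[np.isin(grid, xs)] = np.inf  # never re-simulate a visited point
        x_next = float(grid[np.argmin(pred)])
        # One "expensive" simulation per step; the surrogate is refit next iteration.
        xs.append(x_next)
        errs.append(mse(simulate_spectrum(x_next), target))
    best = int(np.argmin(errs))
    return xs[best], errs[best]

target = simulate_spectrum(2.5)
x_best, err_best = refine(target, x_init=[1.0, 2.0, 3.0, 4.0])
```

Each iteration spends exactly one simulation on the candidate the surrogate currently considers most promising, which is the sense in which the computational budget is counted in simulations.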

Paper Structure (30 sections, 7 figures, 1 table)

Figures (7)

  • Figure 1: Schematic of the Bayesian optimization loop in ActiveStructOpt, using the structure determination of lithium nickel oxide from an X-ray pair distribution function (PDF) as an example, following the general surrogate modelling process [forrester2008engineering]. The blue contour plot maps the goodness-of-fit over a two-dimensional projection of the structural space, where a darker color represents a better fit to the target X-ray PDF (the target structure itself is marked with a red X). The green contour plots show the same projection with the goodness-of-fit predicted by the surrogate model. Black lines represent the multiple optimization traces run over the surrogate model to determine the next candidate structure.
  • Figure 2: Results for the X-ray pair distribution function benchmarks. a) Performance, measured as the median over the test cases of the best mean squared error seen after a given number of simulations. ActiveStructOpt (solid), reverse Monte Carlo (dashed), DiffPy (dotted), and Bayesian optimization with Gaussian processes (dash-dot) are compared. b) Simulated X-ray PDFs of the starting (black, dashed), target (red, solid), and optimized (purple, solid) structures for the median-performing case in each benchmark. Structure determination minimizes the mean squared error between the target and optimized PDFs. c) Crystal Toolkit [horton2023crystal] visualizations of the starting (left), target (middle), and optimized (right) structures. Legends indicate the color associated with each atomic species.
  • Figure 3: Results for the amorphous carbon EXAFS test. a) Performance of ActiveStructOpt (blue, solid) and reverse Monte Carlo (orange, dashed), measured as the median over the test cases of the best mean squared error seen after a given number of simulations. b) Simulated EXAFS of the starting (black, dashed), target (red, solid), and optimized (purple, solid) structures. Inset shows the same information in R-space. c) Crystal Toolkit [horton2023crystal] visualizations of the starting (left), target (middle), and optimized (right) structures. d) Histograms comparing the distributions of interatomic neighbor distances (left) and of angles between neighbors separated by less than 2.7 Å (right). Dashed lines indicate values for the starting diamond structure.
  • Figure 4: Results for the lithium nickel oxide Jahn-Teller distortions EXAFS test. a) Performance of ActiveStructOpt (blue, solid) and reverse Monte Carlo (orange, dashed), measured as the median over the test cases of the best mean squared error seen after a given number of simulations. b) Simulated nickel EXAFS of the starting (black, dashed), target (red, solid), and optimized (purple, solid) structures. Inset shows the EXAFS in real space. c) Crystal Toolkit [horton2023crystal] visualizations of the starting, target, and optimized structures. Dark blue bonds indicate Ni–O distances longer than 2 Å. d) Histograms of the nickel–oxygen and lithium–oxygen nearest-neighbor distances. Dashed lines indicate values for the starting structure.
  • Figure 5: Results for the lithium nickel oxide Jahn-Teller distortions XANES test. a) Performance of ActiveStructOpt (blue, solid) and reverse Monte Carlo (orange, dashed), measured as the median over the test cases of the best mean squared error seen after a given number of simulations. b) Simulated nickel XANES of the starting (black, dashed), target (red, solid), and optimized (purple, solid) structures. c) Crystal Toolkit [horton2023crystal] visualizations of the starting, target, and optimized structures. Dark blue bonds indicate Ni–O distances longer than 2 Å. d) Histograms of the nickel–oxygen and lithium–oxygen nearest-neighbor distances. Dashed lines indicate values for the starting structure.
  • ...and 2 more figures
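
The performance measure named in panel (a) of the captions above, the median over test cases of the best mean squared error seen after a given number of simulations, can be computed as a running minimum per test case followed by a median across cases. The sketch below assumes every test case logs one error per simulation and that all cases run for the same number of simulations. The function names and the numbers in `runs` are made up for illustration and are not results from the paper.

```python
import numpy as np

def best_so_far(errors):
    """Running minimum of the error after each simulation in one test case."""
    return np.minimum.accumulate(np.asarray(errors, dtype=float))

def median_best_curve(runs):
    """Median across test cases of the best-so-far error at each simulation count."""
    curves = np.stack([best_so_far(r) for r in runs])  # shape: (cases, simulations)
    return np.median(curves, axis=0)

# Illustrative per-simulation errors for three hypothetical test cases.
runs = [
    [0.9, 0.5, 0.7, 0.2],
    [0.8, 0.8, 0.3, 0.4],
    [1.0, 0.6, 0.6, 0.1],
]
curve = median_best_curve(runs)  # array([0.9, 0.6, 0.5, 0.2])
```

Because of the running minimum, the resulting curve never increases, so a later, worse simulation cannot degrade the reported performance.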