Ab Initio Structure Solutions from Nanocrystalline Powder Diffraction Data

Gabe Guo, Tristan Saidi, Maxwell Terban, Michele Valsecchi, Simon JL Billinge, Hod Lipson

Abstract

A major challenge in materials science is determining the structure of nanometer-sized objects. Here we present a novel approach that uses a generative machine learning model based on diffusion processes, trained on 45,229 known structures. The model factors in both the measured diffraction pattern and relevant statistical priors on the unit cell of atomic cluster structures. Conditioned only on the chemical formula and the information-scarce, finite-size-broadened powder diffraction pattern, our model, PXRDnet, can successfully solve simulated nanocrystals as small as 10 angstroms across 200 materials of varying symmetry and complexity, including structures from all seven crystal systems. We show that our model can successfully and verifiably determine structural candidates four out of five times, with the average error among these candidates being only 7% (as measured by the post-Rietveld-refinement R-factor). Furthermore, PXRDnet is capable of solving structures from noisy diffraction patterns gathered in real-world experiments. We suggest that data-driven approaches, bootstrapped from theoretical simulation, will ultimately provide a path towards determining the structure of previously unsolved nanomaterials.

Paper Structure (36 sections, 5 equations, 10 figures, 2 tables)

Figures (10)

  • Figure 1: Nanomaterial PXRD Patterns: We simulate nanoscale shrinkage via the sinc$^2$ filter, which broadens the PXRD peaks (purple lines) relative to the ideal pattern (gray lines). To improve model performance, we create a smoother target PXRD pattern (dotted pink lines) by applying an additional Gaussian filter after the sinc$^2$ filter; this removes the sharp ripples at the expense of further broadening the diffraction pattern. The horizontal axis is $Q$ (Å$^{-1}$), and the vertical axis is scaled intensity (normalized so that the maximum is $1$).
  • Figure 2: Histogram (Distribution) of $\mathbf{R_{wp}^{2}}$: Corresponds to the $R_{wp}^{2}$ results shown in Table \ref{tab:results}. The horizontal axis is $R_{wp}^{2}$ (the R-factor), i.e., the relative error between the predicted material's PXRD pattern and the ground truth material's PXRD pattern. The vertical axis is the number of materials from the testing set whose $R_{wp}^{2}$ falls at a given level. The green line with stars represents PXRDnet candidates, and the red line with dots represents CDVAE-Search candidates. Panel A uses PXRD patterns from 10 Å nanomaterials; Panel B uses PXRD patterns from 100 Å nanomaterials.
  • Figure 3: PXRDnet Structure Predictions: The leftmost column is the ground truth crystal structure. The other columns show PXRDnet's reconstructed crystal structures (after Rietveld refinement) from simulated PXRD patterns of nanocrystals with diameters of 10 Å and 100 Å. For convenient visualization of some examples, we have tessellated their unit cells into supercells and translated the origin of the CIF files. These structures have been uniformly selected from the testing dataset. Thus, the spread of success and failure across these examples is roughly reflective of success and failure across the whole testing dataset.
  • Figure 4: PXRD Comparisons: Comparison (without finite size effects) of the ground truth PXRD pattern, the raw PXRDnet prediction's PXRD pattern, and the Rietveld-refined PXRD pattern. Corresponds to results in Figure \ref{fig:visualizations}.
  • Figure 5: Rietveld Refinement Results: We conducted Rietveld refinement on ten promising candidates for each of $20$ uniformly selected materials, for a total of $200$ structures refined at each nanocrystal size. Panel A shows the results for the 10 Å nanocrystal size; Panel B shows the results for the 100 Å nanocrystal size. For each material, we display the result with the best post-refinement $R_{wp}^{2}$ (which can be calculated without knowledge of the ground truth structure). The horizontal axis is the calculated $R_{wp}^{2}$ before refinement; the vertical axis is the calculated $R_{wp}^{2}$ after refinement. For the 10 Å nanocrystal size, the pre-refinement $R_{wp}^{2}$ has $\mu \pm \sigma = 0.561 \pm 0.388$, and the post-refinement $R_{wp}^{2}$ has $\mu \pm \sigma = 0.132 \pm 0.166$. For the 100 Å nanocrystal size, the pre-refinement $R_{wp}^{2}$ has $\mu \pm \sigma = 0.663 \pm 0.473$, and the post-refinement $R_{wp}^{2}$ has $\mu \pm \sigma = 0.068 \pm 0.066$.
  • ...and 5 more figures
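
The broadening described in the Figure 1 caption and the R-factor described in the Figure 2 caption can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes a kernel of the form $\sin^2(\Delta Q\, D/2)/(\Delta Q\, D/2)^2$ for a nanocrystal of size $D$, a Gaussian width `sigma` chosen arbitrarily, and an unweighted $R_{wp}^{2}$ (sum of squared residuals divided by the sum of squared reference intensities). The function names, grid, and peak positions are all hypothetical.

```python
import numpy as np


def _offsets(q_step, half_width):
    """Symmetric grid of Q offsets used to sample a convolution kernel."""
    n = int(round(half_width / q_step))
    return np.arange(-n, n + 1) * q_step


def sinc2_kernel(q_step, size_angstrom, half_width=2.0):
    """Assumed finite-size kernel: sin^2(dq*D/2) / (dq*D/2)^2."""
    dq = _offsets(q_step, half_width)
    # np.sinc(x) is sin(pi*x)/(pi*x), hence the division by 2*pi.
    kernel = np.sinc(dq * size_angstrom / (2.0 * np.pi)) ** 2
    return kernel / kernel.sum()


def gaussian_kernel(q_step, sigma, half_width=2.0):
    """Gaussian smoothing kernel; sigma (in inverse angstroms) is an assumption."""
    dq = _offsets(q_step, half_width)
    kernel = np.exp(-0.5 * (dq / sigma) ** 2)
    return kernel / kernel.sum()


def broaden(ideal, q_step, size_angstrom, sigma=None):
    """Apply the sinc^2 filter, optionally a Gaussian filter, then rescale to max 1."""
    out = np.convolve(ideal, sinc2_kernel(q_step, size_angstrom), mode="same")
    if sigma is not None:
        out = np.convolve(out, gaussian_kernel(q_step, sigma), mode="same")
    return out / out.max()


def r_wp_squared(y_ref, y_pred):
    """Unweighted relative squared error between two patterns (assumed definition)."""
    return np.sum((y_ref - y_pred) ** 2) / np.sum(y_ref ** 2)


# Toy ideal pattern: two sharp peaks on a uniform Q grid (hypothetical values).
q = np.linspace(0.0, 8.0, 1601)
q_step = q[1] - q[0]
ideal = np.zeros_like(q)
ideal[400] = 1.0    # peak at Q = 2.0 inverse angstroms
ideal[1000] = 0.5   # peak at Q = 5.0 inverse angstroms

p10 = broaden(ideal, q_step, size_angstrom=10.0)                 # 10 angstrom crystal
p100 = broaden(ideal, q_step, size_angstrom=100.0)               # 100 angstrom crystal
smooth = broaden(ideal, q_step, size_angstrom=10.0, sigma=0.1)   # smoothed target
```

In this toy setting, the 10 Å pattern has much wider peaks than the 100 Å pattern, and the Gaussian-smoothed pattern is wider still, which matches the trade-off the Figure 1 caption describes between removing ripples and losing resolution.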