Kernel Learning Assisted Synthesis Condition Exploration for Ternary Spinel

Yutong Liu, Mehrad Ansari, Robert Black, Jason Hattrick-Simpers

TL;DR

This work tackles the challenge of predicting the synthesizability of single-phase Fe$_2$(ZnCo)O$_4$ spinel in a high-throughput co-precipitation workflow. It pairs a kernel-based classifier with global SHapley Additive exPlanations (SHAP) to interpret how synthesis conditions influence single-phase formation, even with a small, imbalanced dataset. The results show that reagent amounts, especially the total metal amount, and K$_2$CO$_3$ concentration critically govern the phase outcome. They also reveal a distinct "missing region" of K$_2$CO$_3$ concentration in which single-phase outcomes are rare, and the observed trends are consistent with the Burton-Cabrera-Frank (BCF) theory of crystal growth. Collectively, the approach provides a data-informed route to designing practical synthesis protocols for complex mixed metal oxides (MMOs) and demonstrates a framework for interpretable synthesis planning in inorganic materials.

Abstract

Machine learning and high-throughput experimentation have greatly accelerated the discovery of mixed metal oxide catalysts by leveraging their compositional flexibility. However, the lack of established synthesis routes for solid-state materials remains a significant challenge in inorganic chemistry. An interpretable machine learning model is therefore essential, as it provides insight into the key factors governing phase formation. Here, we focus on the formation of single-phase Fe$_2$(ZnCo)O$_4$, synthesized via a high-throughput co-precipitation method. We combine a kernel classification model with a novel application of global SHAP analysis to pinpoint the experimental features most critical to single-phase synthesizability by interpreting the contribution of each feature. Global SHAP analysis reveals that the contributions of the precursors and the precipitating agent to single-phase spinel formation align closely with established crystal growth theories. These results not only underscore the importance of interpretable machine learning in refining synthesis protocols but also establish a framework for data-informed experimental design in inorganic synthesis.
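
To make the idea of attributing a kernel classifier's prediction to individual synthesis conditions concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not name the kernel, the SHAP estimator, or the software used. The sketch assumes a Nadaraya-Watson classifier with an RBF kernel as a stand-in model, computes exact Shapley values by enumerating feature coalitions (absent features are replaced by a baseline value), and takes the mean absolute Shapley value per feature as a global importance score. The data are a synthetic toy grid in which only the first feature determines the label, and the feature names are hypothetical.

```python
import math
from itertools import combinations, product


def rbf_kernel_proba(x, X_train, y_train, gamma=8.0):
    """Nadaraya-Watson estimate of P(y = 1 | x) using an RBF kernel."""
    weights = [
        math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xt)))
        for xt in X_train
    ]
    total = sum(weights)
    if total == 0.0:  # all kernel weights underflowed: fall back to the class prior
        return sum(y_train) / len(y_train)
    return sum(w * y for w, y in zip(weights, y_train)) / total


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x; absent features take the baseline value."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = (
                math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            )
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi


# Toy data (synthetic, NOT from the paper): three scaled features on a regular grid.
feature_names = ["metal_amount", "k2co3_conc", "adding_rate"]  # hypothetical names
grid = [0.125, 0.375, 0.625, 0.875]
X_train = [list(p) for p in product(grid, repeat=3)]
y_train = [1 if row[0] > 0.5 else 0 for row in X_train]  # label depends on feature 0 only
baseline = [sum(col) / len(col) for col in zip(*X_train)]  # per-feature mean


def model(x):
    return rbf_kernel_proba(x, X_train, y_train)


# Local attributions for every sample, then a global score per feature.
all_phi = [shapley_values(model, x, baseline) for x in X_train]
global_importance = [
    sum(abs(phi[j]) for phi in all_phi) / len(all_phi)
    for j in range(len(feature_names))
]

for name, score in zip(feature_names, global_importance):
    print(f"{name}: {score:.4f}")
```

Exact enumeration costs $2^{n-1}$ model evaluations per feature, which is manageable for the handful of experimental parameters considered here but would call for a sampling-based estimator with many more features.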

Paper Structure

This paper contains 15 sections, 6 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: A summary of the overall workflow from experiment to analysis. Fe$_2$(ZnCo)O$_4$ spinel samples are prepared by high-throughput (HT) synthesis using a co-precipitation method on a Chemspeed automated platform, followed by a series of post-treatments. The final product is suspended in ink for HT-XRD measurement. The single-phase classifications obtained from the XRD results are used as training data for the synthesizability model.
  • Figure 2: Ground-truth distributions of five key experimental parameters for single-phase versus multi-phase Fe$_2$(ZnCo)O$_4$ spinel samples generated on the Chemspeed automated platform. Phase labels are determined via high-throughput XRD. (A) metal precursor concentration, (B) adding rate, (C) metal amount, (D) K$_2$CO$_3$ concentration, and (E) precipitation order. The total number of experimental samples is 70, of which only 17 yield the desired single-phase solution.
  • Figure 3: Kernel learning is used in the ternary spinel synthesizability classification model. (A) Absolute correlation matrix of the selected experimental features using Pearson's correlation coefficient. Other features with an absolute correlation $\geq 0.55$ are considered highly correlated and are therefore disregarded. (B) Confusion matrix for the binary classification of the solution's phase. The leave-one-out cross-validation (LOOCV) accuracy and AUC are 0.843 and 0.836, respectively. (C) The calibration curve shows that the model's predicted probabilities align with the observed frequencies of single-phase outcomes, with some natural zigzagging due to sparse bin populations in LOOCV.
  • Figure 4: Contour plots of the global SHAP values for different pairs of experimental features, aggregated over the synthetic space of 43,000 experiment samples. The phase of the binary spinel is inferred by the kernel classifier, and the global SHAP value offers a comprehensive view of how each feature combination positively influences single-phase formation, in alignment with theoretical and experimental expectations. Within the K$_2$CO$_3$ concentration range of 0.15 to 0.3 M, all samples exhibit at least one secondary phase, creating a missing region (marked in red) with very few single-phase outcomes (panel C). Although additional experiments confirm that single-phase synthesis is possible there, the overall success rate in this region remains significantly lower than elsewhere, underscoring the distinct effect of K$_2$CO$_3$ on phase formation.
  • Figure 5: Kernel model calibration assessment. The high-uncertainty threshold was set at the 66th percentile (0.47). (A) Distribution of the kernel model's disagreements (errors) in the missing region by uncertainty level and class. (B) Probability versus uncertainty for all samples in the missing region. All of the multi-phase errors (FP) occur in the high-uncertainty region, while most correct predictions (TP/TN) cluster in the low-uncertainty region with higher confidence (probabilities further from 0.5), demonstrating appropriate model calibration.
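
The feature screening and validation steps named in the Figure 3 caption can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the greedy keep-first filtering order, the Nadaraya-Watson RBF classifier used as a stand-in kernel model, the `gamma` value, and the 0.5 decision threshold are all choices made here for the example. Only the 0.55 correlation cutoff and the use of leave-one-out cross-validation come from the caption. The data are a small synthetic toy set with hypothetical column names.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0  # treat constant columns as uncorrelated


def drop_correlated(columns, threshold=0.55):
    """Greedy filter: keep a feature only if |r| < threshold against all kept ones."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def rbf_proba(x, X_train, y_train, gamma):
    """Nadaraya-Watson estimate of P(y = 1 | x) using an RBF kernel."""
    weights = [
        math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xt)))
        for xt in X_train
    ]
    total = sum(weights)
    if total == 0.0:
        return sum(y_train) / len(y_train)
    return sum(w * y for w, y in zip(weights, y_train)) / total


def loocv_accuracy(X, y, gamma=10.0):
    """Hold out each sample once, predict it from the rest, and average the hits."""
    hits = 0
    for i in range(len(X)):
        X_rest = X[:i] + X[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        predicted = 1 if rbf_proba(X[i], X_rest, y_rest, gamma) >= 0.5 else 0
        hits += int(predicted == y[i])
    return hits / len(X)


# Toy data (synthetic, NOT from the paper); column names are hypothetical.
metal_amount = [0.1 * i for i in range(1, 11)]
columns = {
    "metal_amount": metal_amount,
    "metal_amount_rescaled": [2 * m + 1 for m in metal_amount],  # perfectly correlated
    "k2co3_conc": [0.5, 0.1, 0.4, 0.2, 0.3, 0.3, 0.2, 0.4, 0.1, 0.5],
}
labels = [1 if m > 0.55 else 0 for m in metal_amount]  # 1 = single phase (toy rule)

kept = drop_correlated(columns)
X = [list(row) for row in zip(*(columns[name] for name in kept))]
accuracy = loocv_accuracy(X, labels)

print("kept features:", kept)
print("LOOCV accuracy:", accuracy)
```

Leave-one-out is a natural fit for a dataset of only 70 samples because every sample serves as a test point exactly once, at the cost of refitting the model once per sample.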