Chemical-space completeness: a new strategy for crystalline materials exploration

Fengyu Xie, Ruoyu Wang, Taoyuze Lv, Yuxiang Gao, Hongyu Wu, Zhicheng Zhong

TL;DR

This work introduces a chemical-system–centric framework for crystalline materials exploration that concentrates the search within bounded chemical spaces and uses a closed-loop cycle of structure generation, fast energy evaluation with neural force fields, and targeted DFT refinement. In the Li–P–S system, the approach converges rapidly toward chemical-space completeness, attaining meV-scale accuracy with modest DFT data and saturating local bonding environments after the early iterations. It autonomously rediscovers known motifs and generates new, chemically plausible P–S anionic units, enabling downstream phase-diagram construction, ionic-diffusivity screening, and electronic-structure prediction via integrated ML-based electronic-density models. The resulting data-efficient, end-to-end pipeline bridges atomistic and electronic structure within defined chemical spaces, offering a scalable route toward AI-driven materials discovery with first-principles fidelity.

Abstract

The emergence of deep learning has brought the long-standing goal of comprehensively understanding and exploring crystalline materials closer to reality. Yet, universal exploration across all elements remains hindered by the combinatorial explosion of possible chemical environments, making it difficult to balance accuracy and efficiency. Crucially, within any finite set of elements, the diversity of short-range bonding types and local geometric motifs is inherently limited. Guided by this chemical intuition, we propose a chemical-system-centric strategy for crystalline materials exploration. In this framework, generative models are coupled with machine-learned force fields as fast energy evaluators, and both are iteratively refined in a closed-loop cycle of generation, evaluation, and fine-tuning. Using the Li-P-S ternary system as a case study, we show that this approach captures the diversity of local environments with minimal additional first-principles data while maintaining structural creativity, achieving closed-loop convergence toward chemical completeness within a bounded chemical space. We further demonstrate downstream applications, including phase-diagram construction, ionic-diffusivity screening, and electronic-structure prediction. Together, this strategy provides a systematic and data-efficient framework for modeling both atomistic and electronic structures within defined chemical spaces, bridging accuracy and efficiency, and paving the way toward scalable, AI-driven discovery of crystalline materials with human-level creativity and first-principles fidelity.
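
The closed-loop cycle of generation, evaluation, and fine-tuning can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the motif labels, the uniform random sampler standing in for a generative model, the ranking rule that favours rarely seen motifs, and the use of Shannon entropy over discrete labels as a stand-in for the information entropy of local atomic features. No force field or DFT code is called.

```python
import math
import random
from collections import Counter

# Hypothetical labels standing in for local bonding environments (assumption).
MOTIFS = ["PS4", "P2S6", "P2S7", "P3S9", "PS3"]


def shannon_entropy(labels):
    """Shannon entropy (in nats) of a list of discrete labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sum((c / total) * math.log(total / c) for c in counts.values())


def generate(rng, n):
    """Toy 'generative model': sample n candidate motif labels uniformly."""
    return [rng.choice(MOTIFS) for _ in range(n)]


def closed_loop(n_iter=5, n_candidates=50, n_labelled=5, seed=0):
    """Run generate -> rank -> label -> extend; return entropy per iteration."""
    rng = random.Random(seed)
    dataset = ["PS4"]  # iteration 0: seed data, analogous to an existing database
    history = [shannon_entropy(dataset)]
    for _ in range(n_iter):
        candidates = generate(rng, n_candidates)
        seen = Counter(dataset)
        # Toy 'evaluation': rank candidates so rarely seen motifs come first.
        candidates.sort(key=lambda motif: seen[motif])
        # Only a small subset is 'labelled' and added to the training data.
        dataset.extend(candidates[:n_labelled])
        history.append(shannon_entropy(dataset))
    return history


history = closed_loop()
print([round(h, 3) for h in history])
```

In this toy setting the entropy is bounded by the logarithm of the number of distinct motifs, so it must level off once every motif is represented. That bound is the sketch's analogue of a finite set of local environments within a bounded chemical space.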

Paper Structure

This paper contains 14 sections, 2 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: Schematic of the chemical-system-centric iterative framework for crystal structure exploration. Within a selected chemical space (for example, the Li–P–S system), deep generative models are employed to create candidate crystal structures, while machine-learned force fields (MLFFs) provide fast and accurate evaluation of their thermodynamic stability. Both models are iteratively fine-tuned on a small subset of DFT-evaluated structures, forming a closed-loop workflow that continuously refines generation and evaluation quality. The resulting candidate structures can be further coupled with downstream computations to build databases of atomistic and electronic properties across the explored chemical space.
  • Figure 2: (a) Density distribution of generated structures over $E_{\mathrm{hull}}$ within the Li–P–S chemical space. (b–c) Zero-temperature phase diagrams including all Li–P–S phases from (b) the Materials Project (MP) database and (c) the MP database combined with all iteratively generated stable, unique, and novel (S.U.N.) structures. Green circles denote stable phases on the convex hull, while the blue hexagon marks the newly discovered stable Li$_2$PS$_3$ phase absent from MP. Yellow-to-red dots indicate metastable structures with $E_{\mathrm{hull}} \leq 0.1$ eV/atom, where darker red corresponds to higher $E_{\mathrm{hull}}$ and thus lower stability. (d) Ratios of stable (green), unique (blue), novel (cyan), and total S.U.N. (red) structures as a function of iteration number. (e) Information entropy of local atomic features (red dots) and the total number of DFT calculations (cyan triangles) versus iteration number. Iteration 0 corresponds to Li–P–S structures originally included in the MP database. (f) Crystal structure of the newly generated stable Li$_2$PS$_3$ phase, highlighted as a blue hexagon in (c).
  • Figure 3: Fine-tuning performance of machine-learned force fields in the Li–P–S chemical system. (a) Logarithm of the information-entropy increase rate ($\log_{10}(dH/dN)$, upper panel), test-set energy RMSE (middle panel), and test-set force RMSE (lower panel) as functions of the number of training frames ($N_{\mathrm{train}}$). (b–c) Parity plots comparing DFT-calculated and DPA-3-predicted (b) energies and (c) forces on the test set. The DPA-3 model shown here is fine-tuned using $N_{\mathrm{train}}=4050$.
  • Figure 4: Generative diversity of P–S anionic motifs in the Li–P–S chemical space. (a) Distribution of the first two t-SNE components of the local structural fingerprints of all generated structures (excluding Li atoms). Colored clusters highlight groups of structures containing the representative anion motifs shown in (b–m). (b–m) Representative thiophosphate anions discovered in the iterative process. Purple and yellow spheres represent P and S atoms, respectively. (b) PS$_4^{3-}$ tetrahedral anion. (c) P$_2$S$_6^{2-}$ edge-sharing anion. (d) P$_2$S$_7^{4-}$ corner-sharing anion. (e) P$_3$S$_9^{3-}$ corner-sharing ring anion. (f) (PS$_3$)$_n^{n-}$ polymeric chain. (g) P$_2$S$_6^{4-}$ dimer with a P–P bond. (h) PS$_3^{2-}$ triangular monomer. (i) P$_3$S$_8^{3-}$ ring anion. (j) 1,4-P$_2$S$_8^{2-}$, with two P atoms occupying para positions (1,4-substitution) within a six-membered P$_2$S$_4$ ring, lying opposite each other across two S–S bridges. (k) 1,3-P$_2$S$_8^{2-}$, with two P atoms occupying meta positions (1,3-substitution) within a six-membered P$_2$S$_4$ ring. (l) 1,3-P$_2$S$_7^{2-}$, featuring two P atoms in meta positions within a five-membered P$_2$S$_3$ ring. (m) (PS$_4$)$_n^{n-}$ polymeric chain linked by S–S bonds.
  • Figure 5: Phase diagrams of the Li–P–S chemical space at finite temperatures and pressures. (a) Phase diagram at T = 300 K, P = 1 bar. (b) Phase diagram at T = 600 K, P = 1 bar. (c) Phase diagram at T = 300 K, P = 2 GPa. Green dots represent stable phases on the convex hull, while triangles represent metastable phases with $E_{\mathrm{hull}} < 0.1$ eV/atom; darker red indicates lower stability.
  • ...and 2 more figures
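
Several captions above rely on the energy above the convex hull, $E_{\mathrm{hull}}$. As a minimal sketch of that quantity, the example below computes it for a hypothetical binary A–B system, which is a simplification of the ternary Li–P–S case (a ternary system needs a hull over two composition coordinates). The compositions and formation energies are invented for illustration, and the function names are not taken from the paper.

```python
def lower_hull(points):
    """Lower convex hull of (x, energy) points via Andrew's monotone chain."""
    lowest = {}
    for x, e in points:
        # Keep only the lowest-energy entry at each composition.
        if x not in lowest or e < lowest[x]:
            lowest[x] = e
    hull = []
    for p in sorted(lowest.items()):
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross <= 0:  # hull[-1] lies on or above the chord, so drop it
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def energy_above_hull(x, e, hull):
    """Energy of (x, e) relative to the hull, by linear interpolation."""
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            e_hull = e1 + (e2 - e1) * (x - x1) / (x2 - x1)
            return e - e_hull
    raise ValueError("composition lies outside the hull range")


# Hypothetical data: (fraction of B, formation energy in eV/atom).
phases = [(0.0, 0.0), (0.25, -0.10), (0.5, -0.40), (0.75, -0.30), (1.0, 0.0)]
hull = lower_hull(phases)
for x, e in phases:
    print(f"x = {x:.2f}  E_hull = {energy_above_hull(x, e, hull):.3f} eV/atom")
```

Phases with zero energy above the hull are the stable ones. In this invented data set, the phase at x = 0.25 sits about 0.1 eV/atom above the tie-line between the end member at x = 0 and the phase at x = 0.5, so it would count as metastable.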