From Natural Language to Materials Discovery: The Materials Knowledge Navigation Agent

Genmao Zhuang, Amir Barati Farimani

TL;DR

This work tackles rapid materials discovery by presenting MKNA, a language-driven agent that unifies semantic understanding, literature grounding, data-driven prediction, structure generation, and physics-based validation in a closed loop. It translates open-ended objectives into executable actions, derives quantitative criteria from the literature (such as a Debye-temperature threshold of $\Theta_D > 800$ K), and builds datasets via autonomous code generation to train surrogate models (e.g., CGCNN) and validate stability with M3GNet. In a case study on high-$\Theta_D$ materials, MKNA rediscovered canonical stiff materials and identified novel Be–C–rich candidates, demonstrating interpretable design motifs and a bias toward stiff, thermodynamically stable structures. The results suggest a generalizable platform for language-guided, autonomous materials exploration with potential for synthesis-aware extensions and closed-loop experimental feedback.

Abstract

Accelerating the discovery of high-performance materials remains a central challenge across energy, electronics, and aerospace technologies, where traditional workflows depend heavily on expert intuition and computationally expensive simulations. Here we introduce the Materials Knowledge Navigation Agent (MKNA), a language-driven system that translates natural-language scientific intent into executable actions for database retrieval, property prediction, structure generation, and stability evaluation. Beyond automating tool invocation, MKNA autonomously extracts quantitative thresholds and chemically meaningful design motifs from literature and database evidence, enabling data-grounded hypothesis formation. Applied to the search for high-Debye-temperature ceramics, the agent identifies a literature-supported screening criterion ($\Theta_D > 800$ K), rediscovers canonical ultra-stiff materials such as diamond, SiC, SiN, and BeO, and proposes thermodynamically stable, previously unreported Be–C–rich compounds that populate the sparsely explored 1500–1700 K regime. These results demonstrate that MKNA not only finds stable candidates but also reconstructs interpretable design heuristics, establishing a generalizable platform for autonomous, language-guided materials exploration.
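
The screening criterion stated above is a strict inequality on the Debye temperature. A minimal Python sketch of such a filter follows, assuming candidates arrive as (label, predicted $\Theta_D$ in K) pairs. The function name `screen_by_debye_temperature` and the placeholder entries are hypothetical and are not values or code from the paper.

```python
from typing import Iterable, List, Tuple

# Threshold reported in the abstract (Theta_D > 800 K).
THETA_D_THRESHOLD_K = 800.0


def screen_by_debye_temperature(
    candidates: Iterable[Tuple[str, float]],
    threshold_k: float = THETA_D_THRESHOLD_K,
) -> List[Tuple[str, float]]:
    """Keep candidates whose Theta_D strictly exceeds the threshold,
    sorted from highest to lowest Theta_D."""
    kept = [(label, theta) for label, theta in candidates if theta > threshold_k]
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder labels and values for illustration only (not from the paper).
    placeholder = [
        ("candidate-A", 650.0),
        ("candidate-B", 800.0),   # exactly at the threshold, so excluded
        ("candidate-C", 1550.0),
    ]
    print(screen_by_debye_temperature(placeholder))
```

Because the comparison is strict, a candidate sitting exactly at 800 K is rejected; whether the agent treats the boundary this way is an assumption of the sketch.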

Paper Structure

This paper contains 19 sections, 2 equations, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: Overview of the proposed agentic workflow for materials discovery.
  • Figure 2: Workflow for LLM-generated property retrieval. If a property is not available in a database, GPT-5-mini synthesizes and repairs a custom retrieval routine until valid outputs are obtained.
  • Figure 3: Comparison of retrieval count and accuracy across GPT querying, conventional RAG, and the Map-Reduce method. Only Map-Reduce achieves both high accuracy and broad coverage, enabling reliable extraction of Debye-temperature distributions from literature.
  • Figure 4: Debye-temperature distribution comparison between the Materials Project database, literature-derived evidence (Map-Reduce), and MKNA-modified stable candidates. The database distribution (left axis) spans a broad range ($\sim$100–2200 K), while the literature-derived evidence concentrates in the low-to-mid regime and supports the grounded threshold at $\Theta_D>800$ K. After modification and stability filtering, MKNA’s stable candidates exhibit a pronounced shift toward the ultra-stiff regime (1500–1700 K). Dashed curves denote KDE estimates.
  • Figure 5: Structure modification workflow via substitution and perturbation. Prototype structures are selected, expanded, substituted using group-wise and approximate valence-preserving rules, and perturbed to diversify local environments while maintaining physically reasonable interatomic distances.
  • ...and 1 more figure
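
The perturbation step named in the Figure 5 caption (diversifying local environments while keeping interatomic distances physically reasonable) can be illustrated with a small rejection-sampling sketch. This is an assumption about one way to enforce the distance constraint, not the paper's implementation: the function names, the Gaussian displacement model, and the use of plain Cartesian coordinates without periodic images are all simplifications introduced here.

```python
import itertools
import math
import random
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, ...]


def min_pair_distance(coords: Sequence[Vec3]) -> float:
    """Smallest pairwise Euclidean distance (periodic images are ignored)."""
    if len(coords) < 2:
        return math.inf
    return min(math.dist(a, b) for a, b in itertools.combinations(coords, 2))


def perturb_with_distance_check(
    coords: Sequence[Vec3],
    sigma: float,
    min_dist: float,
    rng: random.Random,
    max_tries: int = 100,
) -> Optional[List[Vec3]]:
    """Displace every atom by Gaussian noise of width sigma and accept the
    first trial whose closest pair is at least min_dist apart.
    Returns None if no trial is accepted within max_tries."""
    for _ in range(max_tries):
        trial = [tuple(x + rng.gauss(0.0, sigma) for x in atom) for atom in coords]
        if min_pair_distance(trial) >= min_dist:
            return trial
    return None


if __name__ == "__main__":
    # Hypothetical three-atom cluster in arbitrary length units.
    atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    perturbed = perturb_with_distance_check(atoms, sigma=0.05, min_dist=1.5,
                                            rng=random.Random(0))
    print(perturbed)
```

Rejecting the whole trial keeps the sketch short; a crystal-structure version would also need to measure distances across periodic boundaries.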