MatterChat: A Multi-Modal LLM for Material Science

Yingheng Tang, Wenbin Xu, Jie Cao, Weilu Gao, Steve Farrell, Benjamin Erichson, Michael W. Mahoney, Andy Nonaka, Zhi Yao

TL;DR

MatterChat tackles the challenge of integrating full-resolution atomic structures into language-based reasoning for materials science by introducing a bridging module that aligns a pretrained interatomic potential with a pretrained LLM. The architecture combines a Graph-based Material Processing Branch (CHGNet) with a Language Processing Branch (Mistral 7B) through a BLIP2-inspired Bridge Model, enabling effective multi-modal reasoning and text generation. Across nine material-property tasks, MatterChat surpasses open-source LLMs and physical ML baselines in both classification and quantitative predictions, while enabling advanced scientific reasoning and step-by-step synthesis guidance. The approach leverages Retrieval-Augmented Generation and structure-aware embeddings to improve robustness and applicability for accelerated material discovery, with potential impact on energy, electronics, and beyond.

Abstract

Understanding and predicting the properties of inorganic materials is crucial for accelerating advancements in materials science and driving applications in energy, electronics, and beyond. Integrating material structure data with language-based information through multi-modal large language models (LLMs) offers great potential to support these efforts by enhancing human-AI interaction. However, a key challenge lies in integrating atomic structures at full resolution into LLMs. In this work, we introduce MatterChat, a versatile structure-aware multi-modal LLM that unifies material structural data and textual inputs into a single cohesive model. MatterChat employs a bridging module to effectively align a pretrained machine learning interatomic potential with a pretrained LLM, reducing training costs and enhancing flexibility. Our results demonstrate that MatterChat significantly improves performance in material property prediction and human-AI interaction, surpassing general-purpose LLMs such as GPT-4. We also demonstrate its usefulness in applications such as more advanced scientific reasoning and step-by-step material synthesis.
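The bridging idea can be illustrated with a small sketch. The page describes the bridge only as trainable and BLIP2-inspired, so the code below shows the general BLIP-2 pattern (a fixed set of learnable query vectors cross-attending over encoder outputs, followed by a projection to the LLM's embedding width) and should not be read as the authors' implementation. All names, the single-head attention, the toy dimensions, and the random weights standing in for trained parameters are assumptions made for illustration.

```python
import math
import random

# Toy sizes chosen for illustration only; they are not the paper's dimensions.
ATOM_DIM, HIDDEN_DIM, LLM_DIM, NUM_QUERIES = 8, 6, 12, 4


def random_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of Gaussian noise (stand-in for trained weights)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def project(rows, weight):
    """Multiply each row vector (length d_in) by weight (d_in x d_out)."""
    d_out = len(weight[0])
    return [
        [sum(x * weight[k][j] for k, x in enumerate(row)) for j in range(d_out)]
        for row in rows
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bridge(atom_embeddings, params):
    """Cross-attend learnable queries over atom embeddings, then map to LLM width."""
    q = project(params["queries"], params["w_q"])
    k = project(atom_embeddings, params["w_k"])
    v = project(atom_embeddings, params["w_v"])
    scale = math.sqrt(len(q[0]))
    attended = []
    for query in q:
        # One attention weight per atom; the weights sum to 1 for each query.
        weights = softmax([sum(a * b for a, b in zip(query, key)) / scale for key in k])
        attended.append(
            [sum(w * value[j] for w, value in zip(weights, v)) for j in range(len(v[0]))]
        )
    return project(attended, params["w_out"])


rng = random.Random(0)
params = {
    "queries": random_matrix(NUM_QUERIES, HIDDEN_DIM, rng),
    "w_q": random_matrix(HIDDEN_DIM, HIDDEN_DIM, rng),
    "w_k": random_matrix(ATOM_DIM, HIDDEN_DIM, rng),
    "w_v": random_matrix(ATOM_DIM, HIDDEN_DIM, rng),
    "w_out": random_matrix(HIDDEN_DIM, LLM_DIM, rng),
}

# Structures with different atom counts yield the same number of LLM-side tokens.
token_shapes = {}
for n_atoms in (2, 5):
    atoms = random_matrix(n_atoms, ATOM_DIM, rng, scale=1.0)
    tokens = bridge(atoms, params)
    token_shapes[n_atoms] = (len(tokens), len(tokens[0]))
print(token_shapes)
```

The point of the design is that a crystal with any number of atoms is compressed into a fixed number of vectors in the LLM's embedding space, so only the bridge needs training while the material encoder and the LLM can stay frozen.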

Paper Structure

This paper contains 12 sections, 2 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Overview of MatterChat: a modular multi-modal LLM for material-based question-answering. (a) MatterChat architecture: The system includes a material encoder that generates atom embeddings and an LLM that processes language data. These components are connected by a trainable bridge model, which aligns material structure with natural language to support tasks such as material description and property prediction. (b) Elemental distribution across 142,899 compositions, representing the dataset’s compositional diversity. (c) Dataset distribution shown by space groups (outer ring) and crystal systems (inner ring), illustrating structural variation within the dataset.
  • Figure 2: MatterChat accurately predicts material properties and outperforms state-of-the-art LLMs. (a) Illustration of multi-modal material property queries using MatterChat. The model accurately interprets user prompts to predict chemical formulas, crystallographic properties, stability, electronic bandgap, magnetic order, and energy metrics of materials. The three panels demonstrate the framework’s ability to address diverse material science inquiries, showing its alignment of graph-based and textual embeddings for precise question answering. (b) Comparative evaluation of formation energy predictions for newly discovered materials from GNoME [Merchant2023]. Predictions from MatterChat are compared against the ground-truth values and against predictions from commercial LLMs (Gemini [2312.11805], GPT-4o [2410.21276], and DeepSeek [2501.12948]). MatterChat's predictions closely track the ground truth, showing its accuracy and stability in quantitative material evaluation tasks and its ability to integrate material graph embeddings for precise property prediction.
  • Figure 3: MatterChat can solve more sophisticated tasks using capabilities inherited from the pretrained LLM. (a) Material property query for silicon (Si), including its chemical formula, space group, stability, and the reasoning for why it is not stable under standard conditions. (b) Material query for gallium nitride (GaN), providing its chemical formula, space group, and a step-by-step synthesis procedure using methods like HVPE, MOCVD, and MBE. (c) Material query for yttrium iron garnet (YIG, $\rm {Y_3Fe_5O_{12}}$), detailing its chemical formula, space group, and a simplified step-by-step synthesis procedure using the solid-state reaction method.
  • Figure 4: UMAP visualization of structural embeddings extracted from the bridge model. (a) Visualization of samples containing $\rm Si$ and $\rm C$ elements from the Materials Project dataset, showing how materials cluster based on their structural embeddings extracted from the bridge model. The value indicates the structural similarity calculated using the SOAP descriptor in combination with the REMatch kernel (see Methods for further details). (b, c) Visualizations of the SiC subgroup color-coded by structural similarity and formation energy. The two clusters exhibit high structural similarity, with formation energy further assisting in distinguishing between them. (d, e) Visualizations of the Si subgroup color-coded by structural similarity and formation energy. The two clusters demonstrate a smooth transition in both structural similarity and formation energy, indicating that both factors captured by the structural embeddings contribute to the observed clustering. (f) Proposed Multi-modal Retrieval-Augmented Generation (RAG) for robust prediction.
  • Figure 5: Performance comparison of MatterChat, open-source LLMs (Vicuna, Mistral), and physical pre-trained models (SchNet, CHGNet) across nine material property tasks. Panels (a)–(f) show classification task accuracies, where MatterChat consistently outperforms other models. Panels (g)–(i) present root mean square error (RMSE) results for numerical property predictions, demonstrating MatterChat’s superior precision in formation energy, energy above the hull, and bandgap tasks.
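
For readers unfamiliar with the two metrics reported in Figure 5, the following minimal sketch shows how classification accuracy and RMSE are conventionally computed. It uses the standard textbook definitions; the function names and the sample values are made up for illustration and are not taken from the paper or its evaluation code.

```python
import math


def accuracy(predicted, actual):
    """Fraction of predictions that exactly match the reference labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean square error: sqrt of the mean squared difference."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up values for illustration; these are not results from the paper.
stability_pred = ["stable", "unstable", "stable", "stable"]
stability_true = ["stable", "unstable", "unstable", "stable"]
bandgap_pred = [1.0, 0.0, 3.0]  # hypothetical bandgaps in eV
bandgap_true = [1.0, 2.0, 3.0]

print(accuracy(stability_pred, stability_true))  # 3 of 4 labels match: 0.75
print(rmse(bandgap_pred, bandgap_true))
```

Higher is better for accuracy and lower is better for RMSE. Because errors are squared before averaging, RMSE penalizes a single large miss more heavily than several small ones.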