HoneyComb: A Flexible LLM-Based Agent System for Materials Science

Huan Zhang, Yu Song, Ziyu Hou, Santiago Miret, Bang Liu

TL;DR

This work introduces HoneyComb, an open-source LLM-based agent system tailored for materials science that combines a curated knowledge base (MatSciKB), a tool hub (ToolHub) built via Inductive Tool Construction, and a Retriever to select relevant sources. The architecture enables external knowledge access and domain-specific tool use to address common LLM shortcomings in materials science, such as inaccuracies and computational gaps. Empirical evaluation on MaScQA and SciQA across multiple LLM backbones shows consistent performance gains, with ablation confirming the additive value of integrating both MatSciKB and ToolHub. The approach demonstrates strong potential for extending to other knowledge-intensive domains and underscores the practical impact of domain-specific agent systems for scientific research.

Abstract

The emergence of specialized large language models (LLMs) has shown promise in addressing complex tasks for materials science. Many LLMs, however, struggle with the distinct complexities of materials science tasks, such as computational tasks, and often rely heavily on outdated implicit knowledge, leading to inaccuracies and hallucinations. To address these challenges, we introduce HoneyComb, the first LLM-based agent system specifically designed for materials science. HoneyComb leverages a novel, high-quality materials science knowledge base (MatSciKB) and a sophisticated tool hub (ToolHub) to enhance its reasoning and computational capabilities tailored to materials science. MatSciKB is a curated, structured knowledge collection based on reliable literature, while ToolHub employs an Inductive Tool Construction method to generate, decompose, and refine API tools for materials science. Additionally, HoneyComb leverages a retriever module that adaptively selects the appropriate knowledge source or tools for specific tasks, thereby ensuring accuracy and relevance. Our results demonstrate that HoneyComb significantly outperforms baseline models across various tasks in materials science, effectively bridging the gap between current LLM capabilities and the specialized needs of this domain. Furthermore, our adaptable framework can be easily extended to other scientific domains, highlighting its potential for broad applicability in advancing scientific research and applications.

Paper Structure (19 sections, 4 figures, 4 tables, 1 algorithm)

Figures (4)

  • Figure 1: The overall architecture of HoneyComb. A query input triggers the knowledge retrieval phase, in which pertinent data entries and atom functions are extracted from MatSciKB and ToolHub, respectively. The Executor iteratively calls the relevant tools from ToolHub, evaluating and refining these calls until a solution that adequately answers the query emerges. The preliminary solution generated by these tools is combined with the relevant data entries and then processed further by the Retriever. Finally, the Retriever consolidates and filters these inputs and feeds them into the LLM for final answer generation.
  • Figure 2: Tool Assessor and Executor interaction cycle in HoneyComb.
  • Figure 3: Improvements of various LLMs integrated with HoneyComb compared to relevant baseline LLMs for different materials science tasks. With few exceptions, HoneyComb improves the performance of all LLMs across all tasks, showing the utility of tool augmentation.
  • Figure 5: An example of inductive tool construction
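
The data flow described in the Figure 1 caption can be illustrated with a toy sketch. This is not the authors' implementation: the in-memory knowledge base, the two tools, the keyword-overlap scoring, and every function name below (`retrieve_entries`, `select_tools`, `execute`, `answer`) are hypothetical stand-ins chosen only to show the order of the steps (retrieve entries and tools, run tools until one yields a usable result, merge, pass to the LLM).

```python
import math
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical stop-word list so that scoring ignores filler words.
STOPWORDS = {"what", "is", "the", "of", "a", "with", "its", "by", "and", "from", "has"}


def tokenize(text: str) -> set:
    """Lowercase word tokens with stop words removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def overlap(query: str, text: str) -> int:
    """Number of shared tokens; a crude stand-in for a real retriever score."""
    return len(tokenize(query) & tokenize(text))


# Toy stand-in for MatSciKB (assumed content, for illustration only).
KB_ENTRIES = [
    "Copper has a face-centered cubic crystal structure.",
    "The density of a material is its mass divided by its volume.",
    "Silicon is a semiconductor with a diamond cubic structure.",
]


@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[Dict[str, float]], float]


# Toy stand-in for ToolHub (assumed tools, for illustration only).
TOOL_HUB = [
    Tool("density", "compute density from mass and volume",
         lambda a: a["mass"] / a["volume"]),
    Tool("unit_cell_volume", "compute cubic unit cell volume from lattice parameter",
         lambda a: a["a"] ** 3),
]


def retrieve_entries(query: str, k: int = 2) -> List[str]:
    """Return up to k knowledge entries that share at least one token with the query."""
    ranked = sorted(KB_ENTRIES, key=lambda e: overlap(query, e), reverse=True)
    return [e for e in ranked[:k] if overlap(query, e) > 0]


def select_tools(query: str) -> List[Tool]:
    """Return tools whose description matches the query, best match first."""
    ranked = sorted(TOOL_HUB, key=lambda t: overlap(query, t.description), reverse=True)
    return [t for t in ranked if overlap(query, t.description) > 0]


def execute(tools: List[Tool], args: Dict[str, float]) -> Optional[str]:
    """Try candidate tools in order; keep the first result that passes a sanity check."""
    for tool in tools:
        try:
            result = tool.run(args)
        except (KeyError, ZeroDivisionError):
            continue  # missing or invalid arguments: move on to the next tool
        if math.isfinite(result):
            return f"{tool.name} = {result:g}"
    return None


def answer(query: str, args: Dict[str, float], llm: Callable[[str], str]) -> str:
    """Merge retrieved entries and the tool result into one prompt for the LLM."""
    entries = retrieve_entries(query)
    tool_result = execute(select_tools(query), args)
    context = entries + ([f"Tool result: {tool_result}"] if tool_result else [])
    prompt = "Context:\n" + "\n".join(context) + f"\nQuestion: {query}"
    return llm(prompt)


if __name__ == "__main__":
    # The "LLM" here simply echoes its prompt so the assembled context is visible.
    print(answer("What is the density of copper?",
                 {"mass": 8.96, "volume": 1.0}, llm=lambda p: p))
```

Keeping the LLM as an injected callable means the same flow can be exercised with any backbone, which mirrors the caption's separation between retrieval, tool execution, and final answer generation.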