
Xiwu: A Basis Flexible and Learnable LLM for High Energy Physics

Zhengde Zhang, Yiyu Zhang, Haodong Yao, Jianwen Luo, Rui Zhao, Bo Huang, Jiameng Zhao, Yipu Liao, Ke Li, Lina Zhao, Jun Cao, Fazhi Qi, Changzheng Yuan

TL;DR

Xiwu introduces a basis-flexible, learnable LLM tailored for high-energy physics, combining a flexible foundation (Vicuna/Llama-2 lineage), seed-fission data generation, and a vector-store–driven just-in-time learning system to rapidly embed domain knowledge. The architecture pairs a data engine, external memory, and an intelligent agent (HepAI-DDP) with two learning loops to balance fast knowledge updates and long-term model understanding. Empirical results show that Xiwu-13B significantly outperforms Vicuna-13B on HEP QA and reaches a fraction of the performance of ChatGPT-175B, while enabling low-cost, rapid knowledge refreshes and collaborative upgrades. The work also provides a blueprint for adapting LLMs to other specialized domains via open-source tools and a reproducible data-collection pipeline.

Abstract

Large Language Models (LLMs) are undergoing a period of rapid updates and changes, with state-of-the-art (SOTA) models frequently being replaced. When applying LLMs to a specific scientific field, it is challenging to acquire unique domain knowledge while keeping the model itself advanced. To address this challenge, a sophisticated large language model system named Xiwu has been developed, allowing users to switch between the most advanced foundation models and quickly teach the model domain knowledge. In this work, we report on best practices for applying LLMs in the field of high-energy physics (HEP), including: a seed fission technology is proposed, and data collection and cleaning tools are developed, to quickly obtain a domain AI-ready dataset; a just-in-time learning system is implemented based on vector store technology; and an on-the-fly fine-tuning system has been developed to facilitate rapid training under a specified foundation model. The results show that Xiwu can smoothly switch between foundation models such as LLaMA, Vicuna, ChatGLM and Grok-1. The trained Xiwu model significantly outperforms the benchmark model on HEP knowledge question answering and code generation. This strategy significantly enhances the potential for growth of our model's performance, with the hope of surpassing GPT-4 as it evolves with the development of open-source models. This work provides a customized LLM for the field of HEP, while also offering a reference for applying LLMs to other fields. The corresponding code is available on GitHub.
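
The abstract names a vector-store-based just-in-time learning system but does not spell out its mechanics. The Python sketch below illustrates one plausible reading: previously taught question/answer pairs are stored as vectors, and a new query is answered from the store when it is similar enough to a stored question, otherwise it is passed to the model. Everything here is an assumption for illustration, not the authors' implementation: the `VectorStore` class, the `embed` function (a toy bag-of-words stand-in for a real embedding model), the 0.5 similarity threshold, and the `llm` callable are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; a hypothetical stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """Minimal in-memory store of (question embedding, answer) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, question, answer):
        # "Teaching" the system is just appending a new entry; no retraining.
        self.entries.append((embed(question), answer))

    def search(self, query, threshold=0.5):
        # Return the best-matching stored answer, or None below the threshold.
        q = embed(query)
        best_score, best_answer = 0.0, None
        for vec, stored_answer in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_answer = score, stored_answer
        return best_answer if best_score >= threshold else None


def answer(query, store, llm):
    """Prefer taught knowledge from the store; otherwise fall back to the model."""
    hit = store.search(query)
    return hit if hit is not None else llm(query)


# Usage with placeholder content (no real physics facts are asserted here).
store = VectorStore()
store.add("What is the BESIII detector?", "stored answer about BESIII")
fallback_llm = lambda q: "model answer for: " + q
taught = answer("what is the BESIII detector?", store, fallback_llm)
untaught = answer("how do neutrinos oscillate", store, fallback_llm)
```

In this sketch a knowledge update costs one `add` call, which is the kind of low-cost, rapid refresh the TL;DR attributes to the external memory; a slower fine-tuning loop would be a separate process.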

Paper Structure

This paper contains 74 sections, 2 equations, 7 figures.

Figures (7)

  • Figure 1: The hallucination of GPT-4 when answering domain questions.
  • Figure 2: The architecture of Xiwu large language model system.
  • Figure 3: The data resources and acquisition methods. (a) Eight domains related to High Energy Physics that are of our concern; (b) Four methods employed to gather the dataset
  • Figure 4: The seed fission technology for getting diverse and in-depth data
  • Figure 5: Illustration of the algorithm components and training technologies
  • ...and 2 more figures