XiHeFusion: Harnessing Large Language Models for Science Communication in Nuclear Fusion
Xiao Wang, Qingquan Yang, Fuling Wang, Qiang Chen, Wentao Wu, Yu Jin, Jingtao Jiang, Liye Jin, Bo Jiang, Dengdi Sun, Wanli Lv, Meiwen Chen, Zehua Chen, Guosheng Xu, Jin Tang
TL;DR
XiHeFusion addresses the need for accessible, expert-level science communication in nuclear fusion by fine-tuning the open-source Qwen2.5-14B model on a large, multi-source fusion knowledge corpus and integrating Chain-of-Thought reasoning to improve structured, logical responses. The authors build a bilingual, open-source fusion LLM, deploy a long-context Transformer architecture, and employ targeted optimization (SFT, math reasoning data, and cross-language transfer) to strengthen domain accuracy and reasoning. A comprehensive evaluation using a 180+ question nuclear fusion assessment, along with case studies and cross-model comparisons, demonstrates XiHeFusion’s capability to explain fusion concepts and support learners and researchers. This work advances public understanding and potential collaboration in fusion research by providing an accessible, reasoning-capable AI assistant, released for community use under an Apache 2.0 license.
Abstract
Nuclear fusion is one of the most promising ways for humanity to obtain virtually limitless energy. With the rapid development of artificial intelligence, fusion research has also entered a critical period of its development. Helping more people understand nuclear fusion and join its research is one effective way to accelerate the realization of fusion energy. This paper proposes XiHeFusion, the first large model in the field of nuclear fusion, obtained through supervised fine-tuning of the open-source large model Qwen2.5-14B. To support training, we collected multi-source knowledge about nuclear fusion, including Common Crawl, eBooks, arXiv papers, dissertations, etc. After the model has mastered the knowledge of the nuclear fusion field, we further use chain-of-thought reasoning to enhance its logical reasoning ability, enabling XiHeFusion to provide more accurate and logical answers. In addition, we propose a test questionnaire containing 180+ questions to assess the conversational ability of this science popularization model. Extensive experimental results show that our nuclear fusion dialogue model, XiHeFusion, performs well in answering science popularization questions. The pre-trained XiHeFusion model is released at https://github.com/Event-AHU/XiHeFusion.
