TCM-FTP: Fine-Tuning Large Language Models for Herbal Prescription Prediction

Xingzhi Zhou; Xin Dong; Chunhao Li; Yuning Bai; Yulong Xu; Ka Chun Cheung; Simon See; Xinpeng Song; Runshun Zhang; Xuezhong Zhou; Nevin L. Zhang

TL;DR

This study tackles the challenge of predicting Traditional Chinese Medicine (TCM) prescriptions from symptoms by creating DigestDS, a high-quality clinical dataset, and proposing TCM-FTP, a LoRA-based fine-tuning approach for large language models with order-agnostic data augmentation. On DigestDS, the method achieves state-of-the-art herb prediction (F1-score up to 0.8031) and accurate herb dosage prediction (NMSE of 0.0604), significantly outperforming baselines and generic LLMs. By combining parameter-efficient fine-tuning (PEFT) with domain-aware data augmentation, the work demonstrates that fine-tuning LLMs on domain data is crucial for reliable TCM prescription prediction. The results suggest practical potential for assisting TCM practitioners, with future work aimed at deeper integration of domain knowledge to further improve utility and safety.

Abstract

Traditional Chinese medicine (TCM) has relied on specific combinations of herbs in prescriptions to treat various symptoms and signs for thousands of years. Predicting TCM prescriptions poses a fascinating technical challenge with significant practical implications. However, this task faces limitations due to the scarcity of high-quality clinical datasets and the complex relationship between symptoms and herbs. To address these issues, we introduce DigestDS, a novel dataset comprising practical medical records from experienced experts in digestive system diseases. We also propose a method, TCM-FTP (TCM Fine-Tuning Pre-trained), to leverage pre-trained large language models (LLMs) via supervised fine-tuning on DigestDS. Additionally, we enhance computational efficiency using a low-rank adaptation technique. Moreover, TCM-FTP incorporates data augmentation by permuting herbs within prescriptions, exploiting their order-agnostic nature. Impressively, TCM-FTP achieves an F1-score of 0.8031, significantly outperforming previous methods. Furthermore, it demonstrates remarkable accuracy in dosage prediction, achieving a normalized mean square error of 0.0604. In contrast, LLMs without fine-tuning exhibit poor performance. Although LLMs have demonstrated wide-ranging capabilities, our work underscores the necessity of fine-tuning for TCM prescription prediction and presents an effective way to accomplish this.
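
The permutation-based augmentation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`permute_herbs`, `build_examples`), the `"name dose g"` target format, the placeholder herb names, and the number of permutations `k` are all assumptions made for illustration. The idea is only that one symptom description is paired with several orderings of the same herb list, so a fine-tuned model is not encouraged to treat herb order as meaningful.

```python
import random


def permute_herbs(herbs, k, seed=0):
    """Return up to k distinct orderings of one prescription.

    herbs: list of (name, dose_in_grams) tuples (hypothetical format).
    The original ordering is always the first element of the result.
    """
    rng = random.Random(seed)
    seen = {tuple(herbs)}
    orderings = [list(herbs)]
    attempts = 0
    max_attempts = 20 * k  # guard: short lists have few distinct permutations
    while len(orderings) < k and attempts < max_attempts:
        attempts += 1
        candidate = list(herbs)
        rng.shuffle(candidate)
        if tuple(candidate) not in seen:
            seen.add(tuple(candidate))
            orderings.append(candidate)
    return orderings


def build_examples(symptoms, herbs, k, seed=0):
    """Pair one symptom description with k permuted prescription targets."""
    examples = []
    for ordering in permute_herbs(herbs, k, seed):
        # Assumed target format: "name dose g" items joined by commas.
        target = ", ".join(f"{name} {dose}g" for name, dose in ordering)
        examples.append({"input": symptoms, "target": target})
    return examples


if __name__ == "__main__":
    # Placeholder herbs and symptoms, not taken from DigestDS.
    record = [("herb_a", 10), ("herb_b", 6), ("herb_c", 3)]
    for ex in build_examples("abdominal distension, poor appetite", record, k=4):
        print(ex["target"])
```

Every augmented example carries the same set of herbs and doses; only the order in the target string differs. In a real pipeline, `k` would be a tunable hyperparameter, and the prompt and target templates would follow whatever format the fine-tuning dataset uses.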

Paper Structure (23 sections, 6 equations, 3 figures, 3 tables)

INTRODUCTION
BACKGROUND AND SIGNIFICANCE
Problem Definition
TCM Herbal Prescription Prediction
Large Language Models in TCM
Parameter Efficient Fine-Tuning
MATERIALS AND METHODS
Datasets
Data collection
Data processing
Data statistics
TCM-FTP
Low-rank Adaptation
Order-Agnostic Property
Baselines
...and 8 more sections

Figures (3)

Figure 1: Workflow of the TCM-FTP. Our work consists of four parts: “Data Collection” involves gathering and organizing raw data; “Data Processing” includes data preprocessing, prompt design, and integrating data augmentation to create a fine-tuning dataset; “Fine-tuning” utilizes the ShenNong LLM and LoRA technique to optimize the model; and “Evaluation” assesses the outcomes using both quantitative and qualitative evaluation metrics.
Figure 2: Distribution of the number of herbs in prescriptions. The blue curves represent kernel density estimates.
Figure 3: Two Specific Case Analyses. We present two specific test cases, including the inputs, reference outputs, and model predictions, together with expert evaluations. Correctly predicted herb names and dosage weights are marked in red.
