LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing

Maojun Sun

TL;DR

This work addresses the gap between broad LLMs and the precision required for medical knowledge by fine-tuning LLaMA-2 on medical data using low-carbon methods and introducing the Extended Classification Integration (ECI) to produce concise classification labels. It combines a three-step problem-solving prompt with joint optimization of text generation and classification, enabling safer and more actionable medical responses. The authors demonstrate competitive performance relative to state-of-the-art models with similar parameter counts, improve classification behavior via ECI, and release one-shot and few-shot datasets for PubMedQA and USMLE benchmarks. The approach offers a practical path toward environmentally friendly, knowledge-rich medical assistants capable of assisting clinicians and patients with reliable information while reducing computational overhead.

Abstract

Large language models (LLMs) have shown remarkable capabilities in memorizing and presenting knowledge. However, when it comes to domain-specific knowledge and downstream tasks such as those in medicine, general LLMs are often unable to give precise answers. In addition, when people want LLMs to answer classification questions, they usually apply instruction tuning first, yet LLMs do not always give a direct category index after instruction tuning. In this paper, we propose LlamaCare, a fine-tuned medical language model, and Extended Classification Integration (ECI), a module that handles classification problems for LLMs. Our contributions are: (i) We fine-tuned a large language model on medical knowledge with very low carbon emissions and achieved performance similar to ChatGPT using a 24 GB GPU. (ii) We solved the problem of redundant categorical answers and improved the performance of LLMs by proposing a new module called Extended Classification Integration. (iii) We released our processed data for one-shot and few-shot training on benchmarks such as PubMedQA and USMLE Steps 1-3. Our method achieves performance comparable to some state-of-the-art models with the same number of parameters on benchmarks, while being more environmentally friendly by using less GPU computation time. Our models, code, and datasets can be found at https://github.com/Stephen-SMJ/LLamaCare.
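The idea of pairing free-text generation with an explicit class label can be illustrated with a small sketch. This is not the authors' ECI implementation: it assumes, purely for illustration, that the training objective is a weighted sum of a next-token cross-entropy and a classification cross-entropy. The names `joint_loss`, `predict_label`, and the weight `alpha` are hypothetical, and the yes/no/maybe label set is used only as an example of a closed answer set.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target_index])


def joint_loss(token_logits, token_targets, class_logits, class_target, alpha=0.5):
    """Hypothetical joint objective: generation loss + alpha * classification loss.

    token_logits: one logit vector per generated position.
    class_logits: one logit vector over the closed label set.
    alpha: assumed weighting factor, not taken from the paper.
    """
    gen = sum(
        cross_entropy(logits, target)
        for logits, target in zip(token_logits, token_targets)
    ) / len(token_targets)
    cls = cross_entropy(class_logits, class_target)
    return gen + alpha * cls, gen, cls


def predict_label(class_logits, labels):
    """Always return exactly one label: the arg-max of the class logits."""
    best = max(range(len(class_logits)), key=lambda i: class_logits[i])
    return labels[best]


if __name__ == "__main__":
    labels = ["yes", "no", "maybe"]
    total, gen, cls = joint_loss(
        token_logits=[[0.0, 0.0], [0.0, 0.0]],
        token_targets=[0, 1],
        class_logits=[0.0, 0.0, 0.0],
        class_target=2,
    )
    print(total, gen, cls)
    print(predict_label([0.1, 2.0, -1.0], labels))
```

Under this assumption, a separate classification output means a single label can always be read off with an arg-max, regardless of how verbose the generated text is.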

Paper Structure (36 sections, 6 equations, 3 figures, 4 tables)

Introduction
Background and Related Works
Large language model and medical language model
Fine-tune
Prompt Engineering
Methods and Experiments
DataSets for Medical Knowledge
Down-stream Instruction Tuning
Extended Classification Integration
Evaluation
BLEU Score
ROUGE Score
Human Evaluation
ChatGPT Evaluation
Baseline
...and 21 more sections

Figures (3)

Figure 1: Comparison between instruction tuning and ECI. (a) Instruction tuning does not guarantee a response with a single class. (b) Our method, ECI, gives a category every time.
Figure 2: Left: This prompt motivates the model to think about the knowledge related to the question and to extrapolate answers from that knowledge. Fine-tuning LLMs with this prompt can prevent the model from simply reciting the answer. Right: The ECI module allows the model to output an additional label each time during training.
Figure 3: Fine-tuning loss in the experiments. (a) Fine-tuning on medical text. (b) Evaluation on medical text. (c) Fine-tuning on benchmark datasets. (d) Evaluation on benchmark datasets.
