CPLLM: Clinical Prediction with Large Language Models

Ofir Ben Shoham, Nadav Rappoport

TL;DR

This work introduces CPLLM, a framework that fine-tunes large language models with prompts for clinical prediction from structured EHR data. By fine-tuning Llama2 and BioMedLM with PEFT/QLoRA and augmenting the tokenizer with additional medical tokens, CPLLM achieves state-of-the-art results in both diagnosis prediction and hospital readmission prediction on MIMIC-IV and eICU-CRD, without requiring domain-specific pre-training. The method emphasizes handling of longer sequences, on-premise deployability, and robustness across diverse datasets, offering a practical pathway for integrating LLMs into clinical decision support. The study reports consistent improvements in PR-AUC and ROC-AUC over strong baselines and provides code and data-processing pipelines for reproducibility.

Abstract

We present Clinical Prediction with Large Language Models (CPLLM), a method that involves fine-tuning a pre-trained Large Language Model (LLM) for clinical disease and readmission prediction. We utilized quantization and fine-tuned the LLM using prompts. For diagnosis prediction, we predict whether patients will be diagnosed with a target disease during their next visit or in the subsequent diagnosis, leveraging their historical diagnosis records. We compared our results to various baselines, including RETAIN and Med-BERT, the current state-of-the-art model for disease prediction using temporal structured EHR data. We also evaluated CPLLM for patient hospital readmission prediction and compared our method's performance with benchmark baselines. Our experiments have shown that our proposed method, CPLLM, surpasses all the tested models in terms of PR-AUC and ROC-AUC metrics, showing state-of-the-art results for diagnosis prediction and patient hospital readmission prediction. Such a method can be easily implemented and integrated into the clinical process to help care providers estimate the next steps of patients.

Paper Structure

This paper contains 17 sections, 1 figure, 4 tables.

Figures (1)

  • Figure 1: Illustration of the fine-tuning process for diagnosis prediction. (A) An example of EHR structured data. The patient has three diagnoses. (B) The patient's historical data is extracted from the EHR and decoded into a textual list of descriptions. (C) The decoded textual data is then injected into a designed prompt for fine-tuning the LLM. Fine-tuning prompts consist of a general description, the patient's diagnosis history, and a label. The label is set to 1 when the patient is diagnosed with the outcome of interest (e.g., Adult Respiratory Failure) in the subsequent diagnosis or during the next admission, depending on the task.
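
The three stages described in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the lookup table `CODE_DESCRIPTIONS`, the helpers `decode_history` and `build_example`, and the prompt wording are all hypothetical, and the paper's actual prompt template is not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical code-to-description lookup for illustration only; a real
# pipeline would load a complete diagnosis-code dictionary.
CODE_DESCRIPTIONS = {
    "I10": "Essential (primary) hypertension",
    "E11.9": "Type 2 diabetes mellitus without complications",
    "N17.9": "Acute kidney failure, unspecified",
}


@dataclass
class Example:
    prompt: str  # text fed to the LLM during fine-tuning
    label: int   # 1 if the outcome of interest occurs next, else 0


def decode_history(codes):
    """Stage B: turn structured diagnosis codes into text descriptions.

    Codes missing from the lookup are skipped (an assumption of this sketch).
    """
    return [CODE_DESCRIPTIONS[c] for c in codes if c in CODE_DESCRIPTIONS]


def build_example(history_codes, next_codes, target_codes):
    """Stage C: inject the decoded history into a prompt and attach a label."""
    descriptions = decode_history(history_codes)
    # General description followed by the patient's diagnosis history.
    prompt = (
        "Below is a list of a patient's past diagnoses. "
        "Predict whether the patient will be diagnosed with the target disease.\n"
        "Diagnoses:\n" + "\n".join(f"- {d}" for d in descriptions)
    )
    # Label is 1 when any target code appears in the next visit or diagnosis.
    label = int(any(c in target_codes for c in next_codes))
    return Example(prompt=prompt, label=label)


positive = build_example(["I10", "E11.9", "N17.9"], ["J96.00"], {"J96.00"})
negative = build_example(["I10", "E11.9"], ["I10"], {"J96.00"})
```

Here `positive.label` is 1 because the hypothetical target code appears among the patient's next codes, while `negative.label` is 0. Whether `next_codes` holds the single subsequent diagnosis or all diagnoses from the next admission depends on the task, as the caption notes.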