ClinicalMamba: A Generative Clinical Language Model on Longitudinal Clinical Notes
Zhichao Yang, Avijit Mitra, Sunjae Kwon, Hong Yu
TL;DR
ClinicalMamba addresses the challenge of modeling long-range information in clinical notes by extending context length to 16,000 tokens using a selective state-space mechanism within a Mamba-based architecture. Pretrained on longitudinal MIMIC-III notes, the 130M and 2.8B parameter variants demonstrate superior long-context information extraction, outperforming Mamba, clinical Llama, and zero-shot GPT-4 on tasks like cohort selection and ICD coding, while maintaining favorable perplexity-throughput trade-offs. The work introduces a prompt-based fine-tuning approach to enable few-shot adaptation and provides publicly released models to foster longitudinal clinical NLP research. Overall, the results suggest that long-context generative clinical LMs can achieve high accuracy with reduced compute, enabling scalable, longitudinal analysis of patient histories.
Abstract
The advancement of natural language processing (NLP) systems in healthcare hinges on the ability of language models to interpret the intricate information contained within clinical notes. This process often requires integrating information from various time points in a patient's medical history. However, most earlier clinical language models were pretrained with a context length limited to roughly one clinical document. In this study, we introduce ClinicalMamba, a specialized version of the Mamba language model, pretrained on a vast corpus of longitudinal clinical notes to address the unique linguistic characteristics and information processing needs of the medical domain. ClinicalMamba, released with 130 million and 2.8 billion parameters, models clinical language across extended text lengths better than Mamba and clinical Llama. With few-shot learning, ClinicalMamba achieves notable speed and accuracy, outperforming existing clinical language models and general-domain large models such as GPT-4 on information extraction tasks over longitudinal clinical notes.
