Large language models in healthcare and medical domain: A review
Zabir Al Nazi, Wei Peng
TL;DR
LLMs offer transformative potential for healthcare by synthesizing vast medical knowledge and supporting nuanced clinical tasks, yet their deployment must address safety, privacy, and bias concerns. The paper surveys Transformer-based architectures, large foundational models, and multimodal LLMs, then catalogs healthcare-focused models, use cases, explainability approaches, and performance benchmarks. It provides quantitative comparisons across medical datasets (e.g., MedQA, MedNLI, PubMedQA) and discusses limitations such as hallucinations, data privacy, and regulatory compliance, while outlining future directions like federated learning and integrated multimodal data. Overall, the work guides researchers and clinicians in selecting appropriate, safe, and compliant AI systems for healthcare deployment and highlights the governance needed to translate advances into real-world patient benefits.
Abstract
The deployment of large language models (LLMs) within the healthcare sector has sparked both enthusiasm and apprehension. These models exhibit the remarkable capability to provide proficient responses to free-text queries, demonstrating a nuanced understanding of professional medical knowledge. This comprehensive survey delves into the functionalities of existing LLMs designed for healthcare applications, elucidating the trajectory of their development from traditional Pretrained Language Models (PLMs) to the present state of LLMs in the healthcare sector. First, we explore the potential of LLMs to amplify the efficiency and effectiveness of diverse healthcare applications, particularly focusing on clinical language understanding tasks. These tasks encompass a wide spectrum, ranging from named entity recognition and relation extraction to natural language inference, multimodal medical applications, document classification, and question answering. Additionally, we conduct an extensive comparison of the most recent state-of-the-art LLMs in the healthcare domain, while also assessing the utilization of various open-source LLMs and highlighting their significance in healthcare applications. Furthermore, we present the essential performance metrics employed to evaluate LLMs in the biomedical domain, shedding light on their effectiveness and limitations. Finally, we summarize the prominent challenges and constraints faced by large language models in the healthcare sector, offering a holistic perspective on their potential benefits and shortcomings. This review provides a comprehensive exploration of the current landscape of LLMs in healthcare, addressing their role in transforming medical applications and the areas that warrant further research and development.
