From Text to Multimodality: Exploring the Evolution and Impact of Large Language Models in Medical Practice
Qian Niu, Keyu Chen, Ming Li, Pohsun Feng, Ziqian Bi, Lawrence KQ Yan, Yichao Zhang, Caitlyn Heqi Yin, Cheng Fei, Junyu Liu, Tianyang Wang, Yunze Wang, Silin Chen, Ming Liu, Benji Peng, Xinyuan Song, Ziyuan Qin, Riyang Bao, Zekun Jiang
TL;DR
This review addresses the transition from text-based LLMs to Multimodal LLMs (MLLMs) in medicine amid an unprecedented data surge. It surveys architectural foundations, multimodal data types, and modality-alignment strategies, and maps applications spanning clinical decision support, medical imaging, patient engagement, and research. The authors identify critical gaps in data availability, bias, privacy, interpretability, and evaluation, and argue for robust ethical and regulatory frameworks. They advocate for interdisciplinary collaboration and standardized evaluation to guide safe, effective deployment of MLLMs in clinical practice.
Abstract
Large Language Models (LLMs) have rapidly evolved from text-based systems to multimodal platforms, with significant impact on many sectors, including healthcare. This review traces the progression from LLMs to Multimodal Large Language Models (MLLMs) and their growing influence in medical practice. We examine the current landscape of MLLMs in healthcare, analyzing their applications across clinical decision support, medical imaging, patient engagement, and research. The review highlights the ability of MLLMs to integrate diverse data types, such as text, images, and audio, to provide more comprehensive insights into patient health. We also address the challenges facing MLLM implementation, including data limitations, technical hurdles, and ethical considerations. By identifying key research gaps, this paper aims to guide future investigations in areas such as dataset development, modality alignment methods, and the establishment of ethical guidelines. As MLLMs continue to shape the future of healthcare, understanding their potential and limitations is crucial for their responsible and effective integration into medical practice.
