Foundation Model in Biomedicine
Xiangrui Liu, Yuanyuan Zhang, Qianyu Shang, Yingzhou Lu, Changchang Yin, Xiaoling Hu, Xiaoou Liu, Lulu Chen, Alexander Rodríguez, Yezhou Yang, Ping Zhang, Jintai Chen, Shan Du, Huaxiu Yao, Sheng Wang, Tianfan Fu, Xiao Wang
TL;DR
The paper surveys the integration of foundation models into biomedicine, distinguishing discriminative and generative approaches and detailing how self-supervised pretraining on large unlabeled datasets yields transferable biomedical representations. It maps applications across computational biology, drug discovery, clinical informatics, medical imaging, and public health, highlighting concrete models and tasks such as genome-language representations, RNA-expression embeddings, protein design, cross-modal medical imaging, and EHR-based phenotyping. Key contributions include a structured taxonomy of applications, representative model families (e.g., MLM and contrastive discriminative methods, and autoregressive generative methods), and a discussion of challenges such as data scarcity, generalization, and interpretability, with directions for future research. The work underscores the potential of foundation models to accelerate biomedical discovery, clinical decision support, and public health interventions through scalable, multimodal, and domain-specific pretraining and adaptation. Overall, it frames a roadmap for leveraging large-scale pretraining to advance precision medicine and population health.
Abstract
Foundation models, a term first introduced in 2021, are large-scale pretrained models (e.g., large language models (LLMs) and vision-language models (VLMs)) that learn from extensive unlabeled datasets through unsupervised methods, enabling them to excel in diverse downstream tasks. Such models, for example GPT, can be adapted to applications such as question answering and visual understanding, often outperforming task-specific AI models; their name reflects this broad applicability across fields. The development of biomedical foundation models marks a significant milestone in the use of artificial intelligence (AI) to understand complex biological phenomena and advance medical research and practice. This survey explores the potential of foundation models across biomedical domains, including computational biology, drug discovery and development, clinical informatics, medical imaging, and public health. Its purpose is to inspire ongoing research in the application of foundation models to health science.
