Foundation Model in Biomedicine

Xiangrui Liu, Yuanyuan Zhang, Qianyu Shang, Yingzhou Lu, Changchang Yin, Xiaoling Hu, Xiaoou Liu, Lulu Chen, Alexander Rodríguez, Yezhou Yang, Ping Zhang, Jintai Chen, Shan Du, Huaxiu Yao, Sheng Wang, Tianfan Fu, Xiao Wang

TL;DR

The paper surveys the integration of foundation models into biomedicine, distinguishing discriminative and generative approaches and detailing how self-supervised pretraining on large unlabeled datasets yields transferable biomedical representations. It maps applications across computational biology, drug discovery, clinical informatics, medical imaging, and public health, highlighting concrete models and tasks such as genome-language representations, RNA-expression embeddings, protein design, cross-modal medical imaging, and EHR-based phenotyping. Key contributions include a structured taxonomy of applications, representative model families (e.g., MLM/contrastive discriminative methods and autoregressive generative methods), and discussion of challenges like data scarcity, generalization, and interpretability, with directions for future research. The work underscores the potential of foundation models to accelerate biomedical discovery, clinical decision support, and public health interventions through scalable, multimodal, and domain-specific pretraining and adaptation. Overall, it frames a roadmap for leveraging large-scale pretraining to advance precision medicine and population health at scale.

Abstract

Foundation models, a term introduced in 2021, are large-scale pretrained models (e.g., large language models (LLMs) and vision-language models (VLMs)) that learn from extensive unlabeled datasets through unsupervised methods, enabling them to excel in diverse downstream tasks. These models, like GPT, can be adapted to various applications such as question answering and visual understanding, outperforming task-specific AI models and earning their name from their broad applicability across fields. The development of biomedical foundation models marks a significant milestone in the use of artificial intelligence (AI) to understand complex biological phenomena and advance medical research and practice. This survey explores the potential of foundation models in diverse domains within biomedical fields, including computational biology, drug discovery and development, clinical informatics, medical imaging, and public health. The purpose of this survey is to inspire ongoing research in the application of foundation models to health science.

Paper Structure

This paper contains 27 sections and 3 figures.

Figures (3)

  • Figure 1: Overview of foundation model training strategies: masked language modeling for token recovery, contrastive learning for aligning representations across image pairs, and next-token prediction for autoregressive sequence modeling.
  • Figure 2: Overview of the foundation models in different biomedical fields. The foundation model is first pre-trained with massive unlabeled data in a self-supervised fashion. Then, it can be easily adapted for various downstream applications, including computational biology, drug discovery, public health, medical imaging, and clinical informatics.
  • Figure 3: Applications of foundation models across biomedical domains. (a) Computational biology: foundation models learn from DNA, RNA, and protein sequences for disease estimation, gene networks, and protein structure prediction. (b) Medical imaging: pathology, radiology, and retinal images enable segmentation, classification, and multimodal representation learning. (c) Clinical informatics: structured and unstructured health records support summarization, question answering, patient representation, and treatment effect estimation. (d) Drug discovery and development: molecular representations drive property prediction, drug repurposing, and predictive clinical trials. (e) Public health: multimodal signals from social media and epidemic metadata enable mental health surveillance and pandemic forecasting.
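
The three training strategies named in the Figure 1 caption can be illustrated with a toy sketch. This is not code from the paper: the function names, the `[MASK]` symbol, the masking probability, and the temperature are all assumptions chosen for illustration, and a real model would predict the targets with a neural network rather than simply constructing them.

```python
import math
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for this sketch


def mlm_example(tokens, mask_prob=0.15, rng=None):
    """Masked language modeling: hide some tokens; targets are the hidden originals."""
    rng = rng or random.Random(0)
    inputs, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets[i] = tok  # the model would be trained to recover this token
        else:
            inputs.append(tok)
    return inputs, targets


def next_token_pairs(tokens):
    """Next-token prediction: each target is predicted from the tokens before it."""
    return [(tokens[:t], tokens[t]) for t in range(1, len(tokens))]


def info_nce(anchors, positives, temperature=0.1):
    """Contrastive (InfoNCE-style) loss: anchors[i] should match positives[i]
    more closely than any other positives[j]."""

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)
```

For example, `next_token_pairs(["A", "C", "G"])` yields the pairs `(["A"], "C")` and `(["A", "C"], "G")`, and `info_nce` returns a lower loss when each anchor embedding is paired with its matching positive than when the pairs are shuffled.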