A Comprehensive Survey of Foundation Models in Medicine

Wasif Khan, Seowung Leem, Kyle B. See, Joshua K. Wong, Shaoting Zhang, Ruogu Fang

TL;DR

Foundation models offer a unified, data-efficient path to integrate diverse medical data, yet their deployment in healthcare demands domain-specific tailoring, robust evaluation, and privacy safeguards. This survey provides a comprehensive taxonomy, traces historical progress, surveys flagship medical FMs, and catalogs applications across clinical NLP, medical imaging, omics, and beyond, highlighting both opportunities and challenges. Key contributions include a structured overview of architectures, tasks, datasets, and representative models, plus open research questions and lessons learned to guide responsible deployment and future work. The work underscores the need for multi-modal, privacy-preserving, and interpretable FM systems that can scale across clinical settings and support personalized medicine.

Abstract

Foundation models (FMs) are large-scale deep learning models trained on massive datasets, often using self-supervised learning techniques. These models serve as a versatile base for a wide range of downstream tasks, including those in medicine and healthcare. FMs have demonstrated remarkable success across multiple healthcare domains. However, existing surveys in this field do not comprehensively cover all areas where FMs have made significant strides. In this survey, we present a comprehensive review of FMs in medicine, focusing on their evolution, learning strategies, flagship models, applications, and associated challenges. We examine how prominent FMs, such as the BERT and GPT families, are transforming various aspects of healthcare, including clinical large language models, medical image analysis, and omics research. Additionally, we provide a detailed taxonomy of FM-enabled healthcare applications, spanning clinical natural language processing, medical computer vision, graph learning, and other biology- and omics-related tasks. Despite the transformative potential of FMs, they also pose unique challenges. This survey delves into these challenges and highlights open research questions and lessons learned to guide researchers and practitioners. Our goal is to provide valuable insights into the capabilities of FMs in health, facilitating responsible deployment and mitigating associated risks.

Paper Structure (77 sections, 8 figures, 2 tables)

This paper contains 77 sections, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Outline of the survey paper. We present an overview of medical FMs, followed by their applications in healthcare. Additionally, we offer in-depth insights into the taxonomy of medical FMs, the opportunities they enable, the challenges they face, and open research questions and lessons learned.
  • Figure 2: Learning architecture, model size, and training data used by representative foundation models. Details can be found in Supplementary Materials Section II.
  • Figure 3: Taxonomy of medical FMs developed for a variety of complex healthcare data and tasks. Each medical FM is shown in a different color according to its base model architecture (such as BERT, Transformers, GPT, CNN, and LLaMA), each tailored to the unique demands of different medical tasks.
  • Figure 4: The Transformer architecture (Vaswani et al., 2017) employs an encoder-decoder structure with multiple stacked layers (Nx), each containing multi-head self-attention and feed-forward layers.
  • Figure 5: Attention mechanism in Transformer architecture.
  • ...and 3 more figures