Data-Centric Foundation Models in Computational Healthcare: A Survey

Yunkun Zhang, Jin Gao, Zheling Tan, Lingfeng Zhou, Kexin Ding, Mu Zhou, Shaoting Zhang, Dequan Wang

TL;DR

This survey investigates a wide range of data-centric approaches in the foundation model (FM) era, from model pre-training to inference, for improving the healthcare workflow. It also offers an outlook on FM-based analytics for improving patient outcomes and clinical workflows in the evolving landscape of healthcare and medicine.

Abstract

The advent of foundation models (FMs) as an emerging suite of AI techniques has opened a wave of opportunities in computational healthcare. The interactive nature of these models, guided by pre-training data and human instructions, has ignited a data-centric AI paradigm that emphasizes better data characterization, quality, and scale. In healthcare AI, obtaining and processing high-quality clinical data records has been a longstanding challenge, with issues ranging from data quantity and annotation to patient privacy and ethics. In this survey, we investigate a wide range of data-centric approaches in the FM era (from model pre-training to inference) for improving the healthcare workflow. We discuss key perspectives in AI security, assessment, and alignment with human values. Finally, we offer an outlook on FM-based analytics for improving patient outcomes and clinical workflows in the evolving landscape of healthcare and medicine. We provide an up-to-date list of healthcare-related foundation models and datasets at https://github.com/Yunkun-Zhang/Data-Centric-FM-Healthcare .

Paper Structure (39 sections, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Data-centric foundation models in computational healthcare.
  • Figure 2: An overview of healthcare data challenges and foundation model-based approaches mentioned in this survey paper.
  • Figure 3: Foundation model (FM) in healthcare.
  • Figure 4: Multi-modal fusion of healthcare data in the FM era. Conventional fusion approaches are enhanced by joint-modal pre-training and comprehensive FMs such as LLMs, enabling downstream applications such as medical QA, drug discovery, and diagnosis.
  • Figure 5: Foundation models address data quantity and data annotation challenges. Left: Foundation models can mitigate data quantity limitations through data augmentation and improved data efficiency. Right: Foundation models can assist with both healthcare text annotation and medical image annotation.
  • ...and 1 more figure