INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and Prognosis

Shih-Cheng Huang; Zepeng Huo; Ethan Steinberg; Chia-Chun Chiang; Matthew P. Lungren; Curtis P. Langlotz; Serena Yeung; Nigam H. Shah; Jason A. Fries

TL;DR

The paper introduces INSPECT, a large-scale multimodal dataset for pulmonary embolism (PE) that combines 3D CT pulmonary angiography (CTPA), radiology report impressions, and longitudinal EHR data from 19,402 patients (23,248 CTPA studies). It defines eight PE-related diagnostic and prognostic tasks and provides baselines across imaging, EHR, and multimodal fusion, with open-source code and trained weights to enable reproducible evaluation. NLP-based labeling of PE from radiology impressions and careful de-identification under a Data Use Agreement enable public sharing while preserving privacy. Experimental results show that imaging-based methods excel in PE diagnosis and EHR-based methods in prognosis, while fusion improves diagnostic performance but not prognosis, highlighting both the promise and the challenge of multimodal approaches for PE. INSPECT thus lays a foundation for future research in multimodal medical AI, providing a rich resource for benchmarking and method development.

Abstract

Synthesizing information from multiple data sources plays a crucial role in the practice of modern medicine. Current applications of artificial intelligence in medicine often focus on single-modality data due to a lack of publicly available, multimodal medical datasets. To address this limitation, we introduce INSPECT, which contains de-identified longitudinal records from a large cohort of patients at risk for pulmonary embolism (PE), along with ground truth labels for multiple outcomes. INSPECT contains data from 19,402 patients, including CT images, radiology report impression sections, and structured electronic health record (EHR) data (i.e., demographics, diagnoses, procedures, vitals, and medications). Using INSPECT, we develop and release a benchmark for evaluating several baseline modeling approaches on a variety of important PE-related tasks. We evaluate image-only, EHR-only, and multimodal fusion models. Trained models and the de-identified dataset are made available for non-commercial use under a data use agreement. To the best of our knowledge, INSPECT is the largest multimodal dataset integrating 3D medical imaging and EHR for reproducible methods evaluation and research.

Paper Structure (49 sections, 1 equation, 10 figures, 23 tables)

Introduction
Related Work
Medical AI for Pulmonary Embolism
Multimodal Fusion for Medical Image Applications
Multimodal Datasets
Cohort Definition & Dataset Composition
Benchmark
Data Processing
Task Definitions
Diagnostic Tasks
Prognostic Tasks
Baseline Models
Imaging
Structured EHRs
Multimodal Fusion
...and 34 more sections

Figures (10)

Figure 1: The INSPECT dataset comprises structured longitudinal EHRs for 19,402 patients, which include diagnosis/procedure codes, labs, medications, vitals, and demographics, as well as 23,248 CT scans, each paired with its corresponding radiology report impression section. We curated PE diagnostic and prognostic labels based on these radiology reports and subsequent visit data.
Figure 2: The cumulative probability distribution of EHR timeline lengths before and after CTPA.
Figure 3: Baseline Models. As baselines, we evaluate both single-modality models and multimodal late fusion models that incorporate data from both images and EHRs. For CT input, we use an LRCN (Long-term Recurrent Convolutional Network) model, while for structured EHR input, we employ MOTOR and gradient-boosted trees. Our multimodal fusion baseline uses a late fusion approach, learning a weighted mean of the individual modalities' predicted probabilities.
Figure 4: A flowchart of our cohort definition process.
Figure 5: Patient timeline length distributions.
...and 5 more figures
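
The late fusion idea in the Figure 3 caption (a learned weighted mean of each modality's predicted probability) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the two-modality setup, and the grid search over a single weight chosen by validation log loss are all assumptions made for illustration, and the arrays below are toy values rather than INSPECT data.

```python
import numpy as np

def log_loss(y_true, p, eps=1e-7):
    """Binary cross-entropy between labels and predicted probabilities."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

def fit_fusion_weight(y_true, p_image, p_ehr, num_steps=101):
    """Pick w in [0, 1] that minimizes validation log loss of the weighted mean.

    Hypothetical helper: a grid search stands in for whatever optimizer
    a real pipeline would use to learn the fusion weight.
    """
    candidates = np.linspace(0.0, 1.0, num_steps)
    losses = [log_loss(y_true, w * p_image + (1 - w) * p_ehr) for w in candidates]
    return float(candidates[int(np.argmin(losses))])

def fuse(p_image, p_ehr, w):
    """Weighted mean of the two modalities' predicted probabilities."""
    return w * p_image + (1 - w) * p_ehr

# Toy validation data (illustrative only): the image model is informative,
# the EHR model is uninformative, so the weight should favor the image model.
y_val = np.array([1.0, 0.0, 1.0, 0.0])
p_image_val = np.array([0.9, 0.1, 0.8, 0.2])
p_ehr_val = np.array([0.5, 0.5, 0.5, 0.5])

w = fit_fusion_weight(y_val, p_image_val, p_ehr_val)
fused = fuse(p_image_val, p_ehr_val, w)
```

Because the fused score is a convex combination of two probabilities, it stays in [0, 1] and needs no further calibration step in this sketch. The weight should be fit on held-out data rather than on the test split.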
