VISION: Toward a Standardized Process for Radiology Image Management at the National Level

Kathryn Knight, Ioana Danciu, Olga Ovchinnikova, Jacob Hinkle, Mayanka Chandra Shekar, Debangshu Mukherjee, Eileen McAllister, Caitlin Rizy, Kelly Cho, Amy C. Justice, Joseph Erdos, Peter Kuzmak, Lauren Costa, Yuk-Lam Ho, Reddy Madipadga, Suzanne Tamang, Ian Goethert

TL;DR

This work documents the VISION pilot, which standardizes national-level radiology image management by securely transferring over 1 million images (chest X-ray and MRI) from VA clinical repositories to ORNL's research environment and linking them to the VA CDW. It details a batch-oriented ETL workflow, DICOM header parsing with pydicom, and metadata extraction that yields a research-ready data space for multimodal analyses, while explicitly addressing data quality, provenance, and governance challenges. Key insights highlight the centrality of upfront metadata standardization, legacy data reconciliation, and non-disruptive integration with clinical workflows as prerequisites for scalable imaging data commons. The practical impact lies in guiding future efforts to automate large-scale imaging ingestion across institutions, enabling robust AI research on linked radiology and EHR data while maintaining privacy, security, and data integrity.

Abstract

The compilation and analysis of radiological images pose numerous challenges for researchers. Both the sheer volume of data and the computational needs of algorithms capable of operating on images are extensive. Additionally, assembling these images is difficult in itself, as exams may differ widely in clinical context, structured annotation available for model training, modality, and patient identifiers. In this paper, we describe our experiences and challenges in establishing a trusted collection of radiology images linked to the United States Department of Veterans Affairs (VA) electronic health record database. We also discuss the implications for making this repository research-ready for medical investigators. Key insights include the specific procedures required for transferring images from a clinical to a research-ready environment, as well as the roadblocks and bottlenecks in this process that may hinder future efforts at automation.
