Synthetic Data in Radiological Imaging: Current State and Future Outlook

Elena Sizikova; Andreu Badal; Jana G. Delfino; Miguel Lago; Brandon Nelson; Niloufar Saharkhiz; Berkman Sahiner; Ghada Zamzmi; Aldo Badano

TL;DR

The paper surveys the role of synthetic data in radiological imaging to overcome data availability and privacy constraints for AI. It categorizes generation techniques into statistical, physical, and hybrid approaches, and discusses disease modeling and evaluation metrics. It highlights real-world applications including algorithm development, testing, in silico trials, and privacy-preserving data sharing, with exemplars like VICTRE and various synthetic datasets. It also addresses limitations, challenges in validation, and regulatory considerations, arguing that continued advances are needed to close the realism and governance gaps.

Abstract

A key challenge for the development and deployment of artificial intelligence (AI) solutions in radiology is solving the associated data limitations. Obtaining sufficient and representative patient datasets with appropriate annotations may be burdensome due to high acquisition cost, safety limitations, patient privacy restrictions, or low disease prevalence rates. In silico data offers a number of potential advantages over patient data, such as diminished patient harm, reduced cost, simplified data acquisition, scalability, improved quality assurance testing, and a mitigation approach to data imbalances. We summarize key research trends and practical uses for synthetically generated data for radiological applications of AI. Specifically, we discuss different types of techniques for generating synthetic examples, their main application areas, and related quality control assessment issues. We also discuss current approaches for evaluating synthetic imaging data. Overall, synthetic data holds great promise in addressing current data availability gaps, but additional work is needed before its full potential is realized.

Paper Structure (22 sections, 1 figure, 2 tables)

This paper contains 22 sections, 1 figure, 2 tables.

Introduction
Terminology
Techniques for Synthetic Data Generation
Statistical Generative Models
Physical Modeling
Digital Human Models
Digital Acquisition Device Models
Hybrid, Physics-Informed Models
Synthesizing Disease Models
Limitations of Data Generation Techniques
Applications
Algorithm Development and Training
Algorithm Testing
Patient Privacy Preservation
Addressing Bias and Other Limitations of Patient Datasets
...and 7 more sections

Figures (1)

Figure 1: Properties of the digital object and acquisition system models can be controlled during the synthetic data generation process. Shown is the variation in imaging dose (number of Monte Carlo histories) generated with the VICTRE pipeline for digital mammography simulation [Badano2018victre], for a digital breast model [graffNewOpensourceMultimodality2016] with fatty breast density and a mass model [de2015computational] with 5 mm radius (adapted from [sizikova2023knowledge]).
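
The link between the number of Monte Carlo histories and image noise in the caption above can be illustrated with a toy model. The sketch below is not the VICTRE pipeline: it assumes a flat image region behind a uniform attenuator, where each simulated photon history is detected independently with a fixed probability. The function names, the `transmission` value, and the pixel count are all hypothetical choices made for illustration. Under these assumptions, the relative pixel noise should fall roughly as the inverse square root of the number of histories.

```python
import random
import statistics


def simulate_flat_region(histories_per_pixel, transmission=0.5,
                         n_pixels=100, seed=0):
    """Return detected photon counts for each pixel of a uniform region.

    Toy assumption: every history is one photon that reaches the detector
    with probability `transmission`, independently of all others.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_pixels):
        detected = sum(rng.random() < transmission
                       for _ in range(histories_per_pixel))
        counts.append(detected)
    return counts


def relative_noise(counts):
    """Standard deviation across pixels divided by the mean pixel value."""
    return statistics.pstdev(counts) / statistics.mean(counts)


if __name__ == "__main__":
    # More histories per pixel stand in for a higher simulated dose.
    for n in (100, 1000, 10000):
        noise = relative_noise(simulate_flat_region(n))
        print(f"{n:>6} histories per pixel: relative noise = {noise:.4f}")
```

Running the script prints one relative-noise value per dose level. A realistic simulation would also model the x-ray spectrum, scatter, detector response, and the anatomy of the digital object, none of which this sketch attempts.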
