CADICA: a new dataset for coronary artery disease detection by using invasive coronary angiography
Ariadna Jiménez-Partinen, Miguel A. Molina-Cabello, Karl Thurnhofer-Hemsi, Esteban J. Palomo, Jorge Rodríguez-Capitán, Ana I. Molina-Ramos, Manuel Jiménez-Navarro
TL;DR
This work introduces CADICA, a publicly accessible dataset of invasive coronary angiography (ICA) videos with bounding-box lesion annotations and rich metadata. It addresses the lack of open data for CAD assessment and improves reproducibility in CAD detection research. The dataset comprises 668 ICA videos from 42 patients (382 selected for classification), with detailed per-frame annotations and multi-artery views (left and right coronary arteries). The authors benchmark five pretrained CNN architectures (MobileNet-V2, ResNet-18/50, NasNet-Mobile, DenseNet-201) on a binary lesion/non-lesion task, using data augmentation and cross-validation (5-fold on LCA and 10-fold overall) and reporting accuracy, F-measure, and balanced accuracy. MobileNet-V2 (LCA) and NasNet-Mobile (RCA) emerge as strong performers, while ResNet-50 achieves the top F-measures. CADICA provides a practical open resource for developing and evaluating computer-aided diagnostic tools for CAD detection in ICA, with future work aimed at improving performance by focusing on severe lesions and applying patch-based methods.
Abstract
Coronary artery disease (CAD) remains the leading cause of death globally, and invasive coronary angiography (ICA) is considered the gold standard of anatomical imaging evaluation when CAD is suspected. However, risk evaluation based on ICA has several limitations, such as its reliance on visual assessment of stenosis severity, which has significant interobserver variability. This motivates the development of a lesion classification system that can support specialists in their clinical procedures. Although deep learning classification methods are well developed in other areas of medical imaging, ICA image classification is still at an early stage. One of the most important reasons is the lack of available, high-quality open-access datasets. In this paper, we report a new annotated ICA image dataset, CADICA, to provide the research community with a comprehensive and rigorous dataset of coronary angiography consisting of a set of acquired patient videos and associated disease-related metadata. This dataset can be used by clinicians to train their skills in angiographic assessment of CAD severity and by computer scientists to create computer-aided diagnostic systems to help in such assessment. In addition, baseline classification methods are proposed and analyzed, validating the functionality of CADICA and giving the scientific community a starting point to improve CAD detection.
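
The baseline evaluation is summarized in terms of accuracy, F-measure, and balanced accuracy on a binary lesion/non-lesion task. As a rough illustration of how these three metrics relate, the sketch below computes them from confusion-matrix counts. It assumes the F-measure is the standard F1 score (harmonic mean of precision and recall) and that "lesion" is the positive class; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, F1, and balanced accuracy from confusion counts.

    Assumption: 'lesion' is the positive class, so tp counts lesion frames
    correctly flagged and tn counts non-lesion frames correctly rejected.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # F1: harmonic mean of precision and recall (assumed F-measure variant).
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0

    # Balanced accuracy: mean of per-class recall, robust to class imbalance.
    balanced_accuracy = (recall + specificity) / 2

    return {
        "accuracy": accuracy,
        "f_measure": f_measure,
        "balanced_accuracy": balanced_accuracy,
    }


if __name__ == "__main__":
    # Made-up counts purely for illustration.
    print(binary_metrics(tp=40, fp=10, tn=30, fn=20))
```

Balanced accuracy is worth reporting alongside plain accuracy when lesion and non-lesion frames are unevenly represented, since it weights the two classes equally regardless of their frequency.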
