DINO-LG: A Task-Specific DINO Model for Coronary Calcium Scoring
Mahmut S. Gokmen, Caner Ozcan, Moneera N. Haque, Steve W. Leung, C. Seth Parker, W. Brent Seales, Cody Bumgardner
TL;DR
This work presents DINO-LG, a task-specific self-supervised learning framework that uses label-guided data augmentation to bias a Vision Transformer-based DINO model toward calcified regions in CT slices. A linear classifier on DINO-LG features identifies slices that contain calcification; those slices are passed to a UNET-based segmentation module, which quantifies coronary artery calcium (CAC) via the Agatston score for the RCA, LAD, LCA, and LCX. On the COCA dataset, DINO-LG improves slice-level detection (sensitivity 0.89, specificity 0.90) over standard DINO (0.79, 0.77) and substantially reduces false-negative and false-positive rates. The integrated system also yields higher CAC-scoring accuracy (average 0.84) and better risk-category discrimination (sensitivity 0.86, specificity 0.97) than a standalone UNET. The approach reduces the need for manual review and could generalize to other ROI-focused clinical tasks by guiding foundation models with limited annotations.
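As a point of reference for the scoring step, the sketch below shows the conventional Agatston calculation: each lesion contributes its area in mm² multiplied by a density weight derived from its peak attenuation in Hounsfield units (130-199 gives 1, 200-299 gives 2, 300-399 gives 3, 400 and above gives 4). This is a minimal illustration and not the authors' implementation. The lesion values, the `min_area_mm2` cutoff, and the function names are assumptions made for the example; in the described system the lesions would come from the UNET segmentation masks.

```python
def density_weight(max_hu: float) -> int:
    """Conventional Agatston density factor from a lesion's peak HU."""
    if max_hu < 130:
        return 0  # below the usual calcium threshold of 130 HU
    if max_hu < 200:
        return 1
    if max_hu < 300:
        return 2
    if max_hu < 400:
        return 3
    return 4


def agatston_score(lesions, min_area_mm2: float = 1.0) -> float:
    """Sum area * density weight over lesions given as (area_mm2, max_hu).

    min_area_mm2 is an assumed noise cutoff; the value used in the
    paper is not stated in this section.
    """
    total = 0.0
    for area_mm2, max_hu in lesions:
        if area_mm2 < min_area_mm2:
            continue  # ignore specks too small to count as a lesion
        total += area_mm2 * density_weight(max_hu)
    return total


# Hypothetical lesions per artery, for illustration only.
lesions_by_artery = {
    "LAD": [(4.0, 250.0), (2.5, 410.0)],
    "RCA": [(3.0, 150.0)],
    "LCX": [(0.5, 500.0)],  # dropped by the area cutoff
    "LCA": [],
}

per_artery = {name: agatston_score(ls) for name, ls in lesions_by_artery.items()}
total_score = sum(per_artery.values())
print(per_artery, total_score)
```

Scoring each artery separately and then summing mirrors the per-vessel reporting described above, while keeping the total score available for risk categorization.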
Abstract
Coronary artery disease (CAD) is one of the leading causes of mortality worldwide and calls for effective risk assessment; coronary artery calcium (CAC) scoring via computed tomography (CT) is a key method for prevention. Traditional methods, primarily based on UNET architectures implemented on pre-built models, face challenges such as the scarcity of annotated CT scans containing CAC and imbalanced datasets, which reduce performance in segmentation and scoring tasks. In this study, we address these limitations by incorporating the self-supervised learning (SSL) technique DINO (self-distillation with no labels), which trains without requiring CAC-specific annotations and is therefore more robust in generating distinct features. The DINO-LG model, which uses label guidance to focus on calcified areas, achieves a sensitivity of 89% and a specificity of 90% for detecting CAC-containing CT slices, compared with a sensitivity of 79% and a specificity of 77% for the standard DINO model. False-negative and false-positive rates are reduced by 49% and 59%, respectively. These reductions give clinicians greater confidence when ruling out calcification in low-risk patients and minimize unnecessary imaging reviews by radiologists. CAC scoring and segmentation are then performed with a basic UNET architecture, applied only to the CT slices that the DINO-LG model identifies as containing calcified areas. Feeding the UNET model only relevant slices improves CAC scoring accuracy and reduces both false positives and false negatives. By minimizing unnecessary tests and treatments, the approach can ultimately lower overall healthcare costs, making it a valuable advancement in CAD risk assessment.
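The link between the reported sensitivity and specificity and the reductions in error rates can be sketched with simple arithmetic: the false-negative rate is 1 minus sensitivity, and the false-positive rate is 1 minus specificity. The snippet below is only an illustration using the rounded figures quoted above; the helper name `relative_reduction` is an assumption introduced for the example. With these rounded inputs the reductions come out at roughly 48% and 57%, close to the reported 49% and 59%, which were presumably computed from unrounded counts.

```python
def relative_reduction(baseline_rate: float, new_rate: float) -> float:
    """Fractional drop of an error rate relative to its baseline."""
    return (baseline_rate - new_rate) / baseline_rate


# Rounded slice-level figures quoted in the abstract.
dino = {"sensitivity": 0.79, "specificity": 0.77}
dino_lg = {"sensitivity": 0.89, "specificity": 0.90}

# Error rates follow directly from sensitivity and specificity.
fn_rate_dino = 1 - dino["sensitivity"]        # missed calcified slices
fn_rate_dino_lg = 1 - dino_lg["sensitivity"]
fp_rate_dino = 1 - dino["specificity"]        # false alarms on clean slices
fp_rate_dino_lg = 1 - dino_lg["specificity"]

fn_reduction = relative_reduction(fn_rate_dino, fn_rate_dino_lg)
fp_reduction = relative_reduction(fp_rate_dino, fp_rate_dino_lg)

print(f"FN rate reduction: {fn_reduction:.1%}")
print(f"FP rate reduction: {fp_reduction:.1%}")
```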
