Cardiovascular Disease Detection By Leveraging Semi-Supervised Learning
Shaohan Chen, Zheyan Liu, Huili Zheng, Qimin Zhang, Yiru Gong
TL;DR
This work tackles cardiovascular disease (CVD) detection under label scarcity by applying semi-supervised learning to exploit large amounts of unlabeled BRFSS data. It benchmarks five semi-supervised methods (Semi-Supervised SVM, Self-Training, Pseudo-Labeling, Mean Teacher, and Pi-Model) against five supervised baselines on a 75/25 train/test split, with the training data further divided into labeled and unlabeled portions. Results show that semi-supervised approaches can achieve competitive or superior performance with fewer labeled examples: Self-Training reaches an AUC of approximately 0.8425 and an F1 score of around 0.5175 when 50% of the training data is labeled, and AUCs remain above 0.75 for most models when the labeled fraction exceeds 30%. The findings highlight the practical value of semi-supervised learning for early CVD screening in clinical environments where labeling costs are high, and suggest future work on additional semi-supervised methods, unsupervised representation learning, graph-based techniques, and transfer-learning hybrids to further improve generalization.
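The evaluation protocol summarized above (a 75/25 train/test split, a partially labeled training set, and Self-Training scored by AUC and F1) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data is synthetic rather than BRFSS, the base learner (logistic regression), the stratified split, the confidence threshold of 0.9, and the helper names `mask_labels` and `run_self_training` are all hypothetical choices made for the example. It relies on scikit-learn's `SelfTrainingClassifier`, which treats the label `-1` as "unlabeled".

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.semi_supervised import SelfTrainingClassifier


def mask_labels(y, labeled_fraction, seed=0):
    """Return a copy of y in which all but `labeled_fraction` of entries are set to -1."""
    rng = np.random.default_rng(seed)
    y_masked = np.asarray(y).copy()
    n_labeled = int(round(labeled_fraction * len(y_masked)))
    n_unlabeled = len(y_masked) - n_labeled
    # Pick which training rows lose their label (scikit-learn's "unlabeled" marker is -1).
    unlabeled_idx = rng.choice(len(y_masked), size=n_unlabeled, replace=False)
    y_masked[unlabeled_idx] = -1
    return y_masked


def run_self_training(labeled_fraction=0.5, seed=0):
    # Assumption: a synthetic, imbalanced binary dataset stands in for the survey data.
    X, y = make_classification(
        n_samples=2000, n_features=20, weights=[0.9, 0.1], random_state=seed
    )

    # 75/25 train/test split; stratification is an assumption of this sketch.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Hide part of the training labels to simulate label scarcity.
    y_semi = mask_labels(y_train, labeled_fraction, seed)

    # Self-training: fit on the labeled rows, then iteratively add
    # unlabeled rows whose predicted class probability is at least 0.9.
    model = SelfTrainingClassifier(LogisticRegression(max_iter=1000), threshold=0.9)
    model.fit(X_train, y_semi)

    proba = model.predict_proba(X_test)[:, 1]
    return {
        "auc": roc_auc_score(y_test, proba),
        "f1": f1_score(y_test, model.predict(X_test)),
        "n_unlabeled": int((y_semi == -1).sum()),
    }


if __name__ == "__main__":
    for fraction in (0.1, 0.3, 0.5):
        print(fraction, run_self_training(labeled_fraction=fraction))
```

Sweeping `labeled_fraction` in this way mirrors the idea of comparing models at different labeling ratios; the scores it prints on synthetic data say nothing about the figures reported for BRFSS.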
Abstract
Cardiovascular disease (CVD) remains a leading cause of death worldwide, creating a need for more effective and timely detection methods. Traditional supervised learning approaches to CVD detection rely heavily on large labeled datasets, which are often difficult to obtain. This paper employs semi-supervised learning models to improve the efficiency and accuracy of CVD detection when few labeled samples are available. By leveraging both labeled data and large amounts of unlabeled data, our approach improves prediction performance while reducing the dependency on labeled data. Experimental results on a publicly available dataset show that semi-supervised models outperform traditional supervised learning techniques, offering a promising approach for the initial identification of cardiovascular disease in clinical environments.
