Robust COVID-19 Detection from Cough Sounds using Deep Neural Decision Tree and Forest: A Comprehensive Cross-Datasets Evaluation
Rofiqul Islam, Nihad Karim Chowdhury, Muhammad Ashad Kabir
TL;DR
This work tackles COVID-19 detection from cough sounds using a framework built on deep neural decision tree (DNDT) and deep neural decision forest (DNDF) classifiers. It integrates feature selection by recursive feature elimination with cross-validation (RFECV), Bayesian optimization of hyper-parameters, SMOTE class balancing, and ROC-AUC-based threshold moving, validated across five large cough datasets (Cambridge, Coswara, COUGHVID, Virufy, NoCoCoDa) and a combined dataset. The study reports strong performance (AUC up to 0.99 on some datasets) and highlights cross-dataset transfer challenges, showing that combining datasets improves generalizability. The results indicate a promising, non-invasive screening approach, while also underscoring the importance of dataset diversity for mitigating bias. The methodology contributes a rigorous cross-datasets evaluation and an ensemble-based classifier architecture for cough-based COVID-19 diagnostics.
Abstract
This research presents a robust approach to classifying COVID-19 cough sounds using deep neural decision trees and deep neural decision forests, and demonstrates consistent performance across diverse cough sound datasets. We begin by extracting a broad set of audio features from cough recordings of COVID-19-positive and COVID-19-negative individuals. To determine the most important features, we use recursive feature elimination with cross-validation. Bayesian optimization then tunes the hyper-parameters of the deep neural decision tree and deep neural decision forest models. Additionally, we apply the synthetic minority over-sampling technique (SMOTE) during training to ensure a balanced representation of positive and negative data. We further refine model performance through threshold optimization, maximizing the ROC-AUC score. We evaluate our approach on five datasets: Cambridge, Coswara, COUGHVID, Virufy, and the combined Virufy with the NoCoCoDa dataset. Consistently outperforming state-of-the-art methods, our proposed approach yields AUC scores of 0.97, 0.98, 0.92, 0.93, 0.99, and 0.99 across the respective datasets. When all datasets are merged into a combined dataset, our method, using a deep neural decision forest classifier, achieves an AUC of 0.97. Our study also includes a comprehensive cross-datasets analysis, revealing demographic and geographic differences in the cough sounds associated with COVID-19. These differences highlight the challenges of transferring learned features across diverse datasets and underscore the potential benefits of dataset integration for improving generalizability and enhancing COVID-19 detection from audio signals.
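The threshold optimization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so this example assumes Youden's J statistic (true-positive rate minus false-positive rate), a common choice when moving a threshold along the ROC curve. The function names `best_threshold` and `predict` and the toy labels and scores are hypothetical and are not taken from the paper.

```python
def best_threshold(y_true, y_score):
    """Return (threshold, J) maximizing Youden's J = TPR - FPR.

    Assumption: Youden's J is used as the ROC-based criterion; the paper's
    exact criterion is not given in the abstract.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    best_t, best_j = None, float("-inf")
    # Every distinct score is a candidate cut-off; predict positive if score >= t.
    for t in sorted(set(y_score), reverse=True):
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:  # strict comparison keeps the highest threshold on ties
            best_t, best_j = t, j
    return best_t, best_j


def predict(y_score, threshold):
    """Turn predicted probabilities into hard 0/1 labels at the given threshold."""
    return [1 if s >= threshold else 0 for s in y_score]


# Hypothetical imbalanced validation set: 6 negatives, 2 positives.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.40, 0.60]

threshold, j = best_threshold(y_true, y_score)
default_labels = predict(y_score, 0.5)       # conventional 0.5 cut-off
tuned_labels = predict(y_score, threshold)   # cut-off chosen on the ROC curve
```

On this toy data the default cut-off of 0.5 misses one of the two positives, while the tuned threshold recovers both at the cost of one false positive. In practice the threshold would be chosen on held-out validation data rather than on the test set.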
