Table of Contents
Fetching ...

COVID-19 Diagnosis from Cough Acoustics using ConvNets and Data Augmentation

Saranga Kingkor Mahanta, Darsh Kaushik, Shubham Jain, Hoang Van Truong, Koushik Guha

TL;DR

A deep learning approach is presented to analyze the acoustic dataset provided in Track 1 of the DiCOVA 2021 Challenge containing cough sound recordings belonging to both COVID-19 positive and negative examples, and the use of Mel Frequency Cepstral Coefficients as the input features to the proposed model is proposed.

Abstract

With the periodic rise and fall of COVID-19 and countries being inflicted by its waves, an efficient, economic, and effortless diagnosis procedure for the virus has been the utmost need of the hour. COVID-19 positive individuals may even be asymptomatic making the diagnosis difficult, but amongst the infected subjects, the asymptomatic ones need not be entirely free of symptoms caused by the virus. They might not show any observable symptoms like the symptomatic subjects, but they may differ from uninfected ones in the way they cough. These differences in the coughing sounds are minute and indiscernible to the human ear, however, these can be captured using machine learning-based statistical models. In this paper, we present a deep learning approach to analyze the acoustic dataset provided in Track 1 of the DiCOVA 2021 Challenge containing cough sound recordings belonging to both COVID-19 positive and negative examples. To perform the classification on the sound recordings as belonging to a COVID-19 positive or negative examples, we propose a ConvNet model. Our model achieved an AUC score percentage of 72.23 on the blind test set provided by the same for an unbiased evaluation of the models. The ConvNet model incorporated with Data Augmentation further increased the AUC-ROC percentage from 72.23 to 87.07. It also outperformed the DiCOVA 2021 Challenge's baseline model by 23% thus, claiming the top position on the DiCOVA 2021 Challenge leaderboard. This paper proposes the use of Mel frequency cepstral coefficients as the feature input for the proposed model.

COVID-19 Diagnosis from Cough Acoustics using ConvNets and Data Augmentation

TL;DR

A deep learning approach is presented to analyze the acoustic dataset provided in Track 1 of the DiCOVA 2021 Challenge containing cough sound recordings belonging to both COVID-19 positive and negative examples, and the use of Mel Frequency Cepstral Coefficients as the input features to the proposed model is proposed.

Abstract

With the periodic rise and fall of COVID-19 and countries being inflicted by its waves, an efficient, economic, and effortless diagnosis procedure for the virus has been the utmost need of the hour. COVID-19 positive individuals may even be asymptomatic making the diagnosis difficult, but amongst the infected subjects, the asymptomatic ones need not be entirely free of symptoms caused by the virus. They might not show any observable symptoms like the symptomatic subjects, but they may differ from uninfected ones in the way they cough. These differences in the coughing sounds are minute and indiscernible to the human ear, however, these can be captured using machine learning-based statistical models. In this paper, we present a deep learning approach to analyze the acoustic dataset provided in Track 1 of the DiCOVA 2021 Challenge containing cough sound recordings belonging to both COVID-19 positive and negative examples. To perform the classification on the sound recordings as belonging to a COVID-19 positive or negative examples, we propose a ConvNet model. Our model achieved an AUC score percentage of 72.23 on the blind test set provided by the same for an unbiased evaluation of the models. The ConvNet model incorporated with Data Augmentation further increased the AUC-ROC percentage from 72.23 to 87.07. It also outperformed the DiCOVA 2021 Challenge's baseline model by 23% thus, claiming the top position on the DiCOVA 2021 Challenge leaderboard. This paper proposes the use of Mel frequency cepstral coefficients as the feature input for the proposed model.

Paper Structure

This paper contains 9 sections, 2 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Extracting Cepstral Coefficients from an audio signal
  • Figure 2: Distribution of duration of audio samples in the dataset
  • Figure 3: Class distribution of the augmented dataset
  • Figure 4: Proposed CNN architecture
  • Figure 5: ROC curves depicting model performance on each fold
  • ...and 1 more figures