
Predicting Tuberculosis from Real-World Cough Audio Recordings and Metadata

George P. Kafentzis, Stephane Tetsing, Joe Brew, Lola Jover, Mindaugas Galvosas, Carlos Chaccour, Peter M. Small

TL;DR

Mobile phone-based applications that integrate clinical symptoms and cough sound analysis could help community health workers and, most importantly, health service programs to improve TB case-finding efforts while reducing costs, which could substantially improve public health.

Abstract

Tuberculosis (TB) is an infectious disease caused by the bacterium Mycobacterium tuberculosis. It primarily affects the lungs but can also affect other parts of the body. TB is spread through the air when an infected person coughs, sneezes, or talks. Medical doctors diagnose TB in patients via clinical examinations and specialized tests. However, coughing is a common symptom of respiratory diseases such as TB. Literature suggests that cough sounds coming from different respiratory diseases can be distinguished by both medical doctors and computer algorithms. Therefore, analyzing cough recordings from patients with and without TB seems to be a reasonable avenue of investigation. In this work, we utilize a very large dataset of TB and non-TB cough audio recordings obtained from the south-east of Africa, India, and the south-east of Asia using a fully automated phone-based application (Hyfe), without manual annotation. We fit statistical classifiers based on spectral and time domain features with and without clinical metadata. A stratified grouped cross-validation approach shows that an average area under the curve (AUC) of approximately 0.70 $\pm$ 0.05 can be achieved for both cough-level and participant-level classification using cough sounds alone. The addition of demographic and clinical factors increases performance, resulting in an average AUC of approximately 0.81 $\pm$ 0.05. Our results suggest that mobile phone-based applications that integrate clinical symptoms and cough sound analysis could help community health workers and, most importantly, health service programs to improve TB case-finding efforts while reducing costs, which could substantially improve public health.
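The evaluation described in the abstract (grouped, stratified folds, with AUC reported per cough and per participant) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the round-robin fold assignment, the mean-score aggregation per participant, and the toy data are all assumptions made for illustration. The abstract does not state how cough-level scores are combined into a participant-level score.

```python
from collections import defaultdict
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def participant_level(participant_ids, labels, scores):
    """Average cough scores per participant (assumed aggregation rule).

    Assumes every cough from a participant carries the same TB label.
    """
    scores_of = defaultdict(list)
    label_of = {}
    for pid, y, s in zip(participant_ids, labels, scores):
        scores_of[pid].append(s)
        label_of[pid] = y
    pids = sorted(scores_of)
    return [label_of[p] for p in pids], [mean(scores_of[p]) for p in pids]


def stratified_group_folds(participant_ids, labels, n_folds):
    """Assign whole participants to folds, dealing each class round-robin.

    Grouping keeps all coughs of one participant in the same fold, so no
    participant appears in both training and test data. This deterministic
    version does not shuffle or balance the number of coughs per fold.
    """
    label_of = dict(zip(participant_ids, labels))
    fold_of = {}
    for cls in sorted(set(label_of.values())):
        members = sorted(p for p, y in label_of.items() if y == cls)
        for i, pid in enumerate(members):
            fold_of[pid] = i % n_folds
    return fold_of


# Toy data (hypothetical): one row per cough.
pids = ["a", "a", "b", "b", "c", "d", "d"]
labels = [1, 1, 0, 0, 1, 0, 0]          # 1 = TB, 0 = non-TB
scores = [0.9, 0.4, 0.5, 0.2, 0.7, 0.6, 0.1]

cough_auc = roc_auc(labels, scores)
p_labels, p_scores = participant_level(pids, labels, scores)
participant_auc = roc_auc(p_labels, p_scores)
folds = stratified_group_folds(pids, labels, n_folds=2)
```

In practice, a library routine such as scikit-learn's `StratifiedGroupKFold` would likely replace the hand-written fold assignment; the sketch only shows why grouping by participant matters when a single person contributes many coughs.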

Paper Structure (13 sections, 5 figures, 4 tables)

Figures (5)

  • Figure 1: Feature Extraction Pipeline.
  • Figure 2: Audio recordings, log-mel spectrograms, and mel-frequency cepstral coefficients for two cough sounds, one from a TB patient (upper three panels), and one from a healthy patient (lower three panels).
  • Figure 3: Best CNN architecture. $[M \times N, L]$ under each Conv 2D block denotes the filter size and the number of filters, respectively. $p=0.5$ under each Dropout block denotes the dropout probability, while "size" and "strides" are parameters of the MaxPooling layer. Finally, the numbers under the Dense layers give the number of neurons in each layer.
  • Figure 4: Cough-only experiment: boxplots of AUC values for all models. Left: per-cough assessment. Right: per-participant assessment. Red solid and green dashed lines denote mean and median value, respectively. For model abbreviations, see Table \ref{tab:ML}.
  • Figure 5: Cough+Metadata experiment: boxplots of AUC values for all models. Left: per-cough assessment. Right: per-participant assessment. Red solid and green dashed lines denote median and mean value, respectively. For model abbreviations, see Table \ref{tab:ML}.