CycleGuardian: A Framework for Automatic Respiratory Sound Classification Based on Improved Deep Clustering and Contrastive Learning

Yun Chu, Qiuhao Wang, Enze Zhou, Ling Fu, Qian Liu, Gang Zheng

TL;DR

CycleGuardian addresses automatic respiratory sound classification under limited labeled data and the need for mobile deployment. It introduces grouped spectrogram encoding with improved deep embedding clustering (IDEC) and group-mix contrastive learning in a lightweight 38MB network, optimized via a multi-objective loss. On the ICBHI2017 dataset without pretrained weights, it achieves specificity (Sp) $82.06\%$, sensitivity (Se) $44.47\%$, and Score $63.26\%$, and it demonstrates on-device deployment on Android. The results show that grouping spectrogram features, combined with deep clustering and contrastive learning, improves discrimination between normal and abnormal sounds and among abnormal types, making on-device auscultation feasible.
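The three reported numbers are related: under the ICBHI 2017 challenge convention, the Score is the arithmetic mean of sensitivity (Se) and specificity (Sp). The sketch below only illustrates that arithmetic; the function name `icbhi_score` is hypothetical and is not taken from the paper.

```python
def icbhi_score(sensitivity: float, specificity: float) -> float:
    """Return the mean of sensitivity and specificity (both given in percent)."""
    return (sensitivity + specificity) / 2.0


# Values reported in the TL;DR (percent).
se = 44.47
sp = 82.06
score = icbhi_score(se, sp)
print(f"Se={se:.2f}  Sp={sp:.2f}  Score={score:.3f}")
```

The mean of the rounded components is about 63.265, which agrees with the reported $63.26\%$ up to rounding of the inputs.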

Abstract

Auscultation plays a pivotal role in the early diagnosis of respiratory and pulmonary diseases. Although deep learning-based methods for automatic respiratory sound classification have emerged since Covid-19, limited datasets hold back further performance gains. Distinguishing normal from abnormal respiratory sounds is challenging because normal respiratory components and noise components coexist in both types. Moreover, different abnormal respiratory sounds exhibit similar anomalous features, which hinders their differentiation. In addition, existing state-of-the-art models have excessive parameter counts, which impedes deployment on resource-constrained mobile platforms. To address these issues, we design a lightweight network, CycleGuardian, and propose a framework based on improved deep clustering and contrastive learning. We first generate a hybrid spectrogram for feature diversity and group the spectrograms to facilitate the capture of intermittent abnormal sounds. Then, CycleGuardian integrates a deep clustering module with a similarity-constrained clustering component to improve the ability to capture abnormal features, and a contrastive learning module with group mixing to enhance abnormal feature discernment. Multi-objective optimization enhances overall performance during training. In experiments on the ICBHI2017 dataset, following the official split and without any pre-trained weights, our method achieves Sp: $82.06\%$, Se: $44.47\%$, and Score: $63.26\%$ with a network model size of 38M. Compared with the current model, our method leads by nearly $7\%$, achieving the current best performance. Additionally, we deploy the network on Android devices, showcasing a comprehensive intelligent respiratory sound auscultation system.
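To make the grouping idea concrete, the sketch below splits a spectrogram's time frames into contiguous groups so that a short, intermittent event falls inside one small group rather than being averaged over the whole recording. This is an illustration under assumptions: the abstract does not state the grouping axis, the number of groups, or whether groups overlap, so the time-axis split, the `num_groups` parameter, and the function name `group_frames` are all hypothetical.

```python
from typing import List, Sequence


def group_frames(frames: Sequence, num_groups: int) -> List[list]:
    """Split a sequence of time frames into contiguous, near-equal groups.

    Assumption: grouping is along the time axis and groups do not overlap.
    Earlier groups receive one extra frame when the split is uneven.
    """
    if num_groups <= 0:
        raise ValueError("num_groups must be positive")
    base, extra = divmod(len(frames), num_groups)
    groups, start = [], 0
    for g in range(num_groups):
        size = base + (1 if g < extra else 0)
        groups.append(list(frames[start:start + size]))
        start += size
    return groups


# Ten dummy frames (each would be one spectrogram column in practice).
example_groups = group_frames(list(range(10)), num_groups=4)
print([len(g) for g in example_groups])
```

Each group would then be encoded separately before clustering and contrastive learning are applied to the group features.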

Paper Structure

This paper contains 31 sections, 11 equations, 14 figures, 10 tables.

Figures (14)

  • Figure 1: Flowchart of the automatic respiratory sound classification framework. In stage 1, the obtained multichannel spectrograms are grouped. In stage 2, each group is feature-encoded by the CycleGuardian network, and deep clustering and contrastive learning are performed. In stage 3, joint optimisation of multiple objectives is performed.
  • Figure 2: The left part shows the GFE Unit in the network, the middle shows the idea of grouping and encoding the spectrogram, and the right part shows the CGL and LGL units.
  • Figure 3: Group mix module: input group$_i$ into GFE Unit to get group feature g$_i$, and then use g$_i$ in group mix to generate new hybrid sample features for subsequent contrastive learning.
  • Figure 4: The Improved Deep Embedding Clustering (IDEC) Module: including the DEC Module and Cluster Projection Fusion (CPF) Module.
  • Figure 5: CycleGuardian network architecture: the red dashed box contains the main modules of the network: the GFE Unit, Group mix, DEC, and CPF modules. The blue dotted box contains the loss terms to be optimised: the clustering loss based on similarity constraints, the contrastive loss, and the classification loss.
  • ...and 9 more figures
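The DEC module named in Figure 4 presumably builds on standard deep embedded clustering, in which each embedding is softly assigned to cluster centers with a Student's t kernel and a sharpened target distribution is matched through a KL divergence. The sketch below shows only that standard formulation in plain Python. It does not include the paper's similarity constraints or the CPF module, whose details are not given here, and the function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def soft_assign(z: List[Vector], centers: List[Vector], alpha: float = 1.0) -> List[List[float]]:
    """q_ij: Student's t similarity between embedding z_i and center mu_j, row-normalised."""
    q = []
    for zi in z:
        row = []
        for mu in centers:
            d2 = sum((a - b) ** 2 for a, b in zip(zi, mu))
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q: List[List[float]]) -> List[List[float]]:
    """p_ij proportional to q_ij^2 / f_j, where f_j is the soft cluster frequency."""
    k = len(q[0])
    f = [sum(row[j] for row in q) for j in range(k)]
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(k)]
        s = sum(w)
        p.append([v / s for v in w])
    return p


def kl_clustering_loss(p: List[List[float]], q: List[List[float]]) -> float:
    """KL(P || Q) summed over all samples and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0.0
    )


# Toy 2-D group embeddings and two cluster centers (illustrative values only).
embeddings = [[0.1, 0.0], [0.2, 0.1], [2.9, 3.0], [3.1, 3.2]]
centers = [[0.0, 0.0], [3.0, 3.0]]
q = soft_assign(embeddings, centers)
p = target_distribution(q)
print(round(kl_clustering_loss(p, q), 6))
```

In a training loop this clustering loss would be one term of the multi-objective loss, alongside the contrastive and classification terms listed in Figure 5.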