A Generalist Audio Foundation Model for Comprehensive Body Sound Auscultation

Pingjie Wang; Liudan Zhao; Zihan Zhao; Miao He; Xin Sun; Ya Zhang; Kun Sun; Yanfeng Wang; Yu Wang

TL;DR

AuscultaBase introduces a generalist foundation model for body-sound auscultation by unifying heart, lung, and bowel sounds through large-scale, self-supervised pretraining on AuscultaCorpus. It is evaluated on AuscultaBench, a 16-task benchmark spanning abnormality detection and disease diagnosis, where AuscultaBase consistently outperforms state-of-the-art baselines and demonstrates robustness across sound types and data imbalances. A clinical comparison with pediatric cardiology experts reveals higher sensitivity and strong accuracy, especially in younger patients, supporting its potential as a diagnostic assistant. The work provides a scalable framework and benchmark for AI-enabled auscultation, with open-source code and model checkpoints to foster further research and clinical translation.

Abstract

Accurate and efficient auscultation-based diagnostics are vital for early disease detection, especially in resource-limited settings where specialized clinical expertise is scarce. Traditional auscultation depends heavily on clinician experience and suffers from significant inter-observer variability, while existing AI models often falter because their training data are not representative. In this study, we introduce AuscultaBase, a novel AI-driven diagnostic framework that combines self-supervised and contrastive learning with large-scale, multi-source data integration to advance body sound analysis. By generating robust feature representations, AuscultaBase markedly improves performance in abnormality detection, disease classification, and activity recognition tasks. Comprehensive evaluations on our newly established benchmark, AuscultaBench, demonstrate that AuscultaBase consistently outperforms state-of-the-art methods across key performance metrics, underscoring its potential as a scalable and cost-effective tool for clinical screening and early disease intervention. The code and model checkpoints have been released at https://github.com/applewpj/AuscultaBase.
