Acoustical Features as Knee Health Biomarkers: A Critical Analysis

Christodoulos Kechris, Jerome Thevenot, Tomas Teijeiro, Vincent A. Stadelmann, Nicola A. Maffiuletti, David Atienza

TL;DR

This work critiques the use of acoustical knee signals as biomarkers and introduces a formal causal framework that separates information originating in the knee from information contributed by external sources. The framework relates knee health $H$, ideal knee vibrations $V$, observed signals $\tilde{V}$, features $X=g(\tilde{V})$, and predictions $Y=f(X)$. It requires that $\tilde{V} \approx V$ and that any Bias Introduction Pathway driven by external sources be accounted for. Three real-world studies show that high classification accuracy can be driven by researchers' expectations, the experimental protocol, or hardware bias rather than by true knee pathology: a counterfactual experiment reaches 96% leave-one-subject-out (LOSO) accuracy, a public dataset contains 33 kHz interference, and device-induced bias can entirely account for health discrimination. The findings argue for rigorous causal attribution, environmental and device controls, and cautious labeling grounded in clinical information before acoustical features are accepted as knee health biomarkers.
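The chain $H \to V \to \tilde{V} \to X \to Y$ can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes scalar signals, additive mixing of an external source into $\tilde{V}$, an identity feature map $g$, and a simple threshold classifier $f$. The names `simulate`, `accuracy`, `knee_effect`, and `bias` are hypothetical and chosen only for this example.

```python
import random

def simulate(n=400, knee_effect=0.0, bias=0.0, seed=0):
    """Draw (H, X) pairs from a toy version of the chain H -> V -> V~ -> X.

    knee_effect: how strongly knee health H shifts the ideal vibration V.
    bias: how strongly an external source that follows the label shifts V~.
    Additive mixing and a scalar feature are assumptions of this sketch.
    """
    rng = random.Random(seed)
    samples = []
    for i in range(n):
        h = i % 2                               # knee health H (0 or 1)
        v = knee_effect * h + rng.gauss(0, 1)   # ideal knee vibration V
        external = bias * h                     # external source tied to the label
        v_obs = v + external                    # observed signal V~
        x = v_obs                               # feature X = g(V~), identity here
        samples.append((h, x))
    return samples

def accuracy(samples):
    """Fit Y = f(X) as a midpoint threshold on the first half, score the second."""
    half = len(samples) // 2
    train, test = samples[:half], samples[half:]
    xs0 = [x for h, x in train if h == 0]
    xs1 = [x for h, x in train if h == 1]
    m0, m1 = sum(xs0) / len(xs0), sum(xs1) / len(xs1)
    thr, up = (m0 + m1) / 2, m1 >= m0
    hits = sum(int((x > thr) == up) == h for h, x in test)
    return hits / len(test)

# The knee carries no health information in either case (knee_effect=0).
acc_no_bias = accuracy(simulate(bias=0.0))
acc_with_bias = accuracy(simulate(bias=4.0))
print(f"no external bias:   {acc_no_bias:.2f}")
print(f"with external bias: {acc_with_bias:.2f}")
```

In this toy setting the classifier only performs well when the external source follows the label, even though $V$ is unrelated to $H$. This is the kind of result that a high accuracy figure alone cannot distinguish from a genuine biomarker.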

Abstract

Acoustical knee health assessment has long promised an alternative to clinically available medical imaging tools, but this modality has yet to be adopted in medical practice. The field is currently led by machine learning models that process acoustical features and have reported promising diagnostic performance. However, these methods overlook the multi-source nature of audio signals and the underlying mechanisms at play. To address this gap, the present paper introduces a novel causal framework for validating knee acoustical features. We argue that current machine learning methodologies for acoustical knee diagnosis lack the required assurances and thus cannot be used to classify acoustic features as biomarkers. Our framework establishes the theoretical guarantees needed to validate acoustic features as biomarkers. We apply our methodology to three real-world experiments investigating the effect of researchers' expectations, the experimental protocol, and the wearable sensor employed. This investigation reveals latent issues such as shortcut learning and performance inflation. This study is the first independent reproduction of results in the field of acoustical knee health evaluation. We conclude with actionable insights from our findings, offering guidance for navigating these limitations in future research.
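One way to probe the sensor-related shortcut mentioned in the abstract is to condition on the device: compare accuracy on pooled data with accuracy within each device. The sketch below uses synthetic data and is not the authors' procedure. It assumes two devices, a scalar feature that depends only on the device, and a device assignment that agrees with the health label 85% of the time. All names and numbers (`make_dataset`, `p_match`, `device_offset`) are hypothetical.

```python
import random

def make_dataset(n=600, p_match=0.85, device_offset=4.0, seed=1):
    """Synthetic rows (health, device, feature).

    The feature depends on the device only, never on health. The device is
    confounded with health: it equals the label with probability p_match.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        health = i % 2
        device = health if rng.random() < p_match else 1 - health
        x = device_offset * device + rng.gauss(0, 1)
        rows.append((health, device, x))
    return rows

def holdout_accuracy(pairs):
    """pairs are (label, feature); midpoint threshold fit on the first half,
    scored on the second half."""
    half = len(pairs) // 2
    train, test = pairs[:half], pairs[half:]
    xs0 = [x for y, x in train if y == 0]
    xs1 = [x for y, x in train if y == 1]
    m0, m1 = sum(xs0) / len(xs0), sum(xs1) / len(xs1)
    thr, up = (m0 + m1) / 2, m1 >= m0
    return sum(int((x > thr) == up) == y for y, x in test) / len(test)

rows = make_dataset()

# Pooled: the classifier can exploit the device offset as a proxy for health.
pooled = holdout_accuracy([(h, x) for h, d, x in rows])

# Conditioned on device: the proxy is gone, so only real health signal could help.
per_device = {
    d: holdout_accuracy([(h, x) for h, dev, x in rows if dev == d])
    for d in (0, 1)
}

print(f"pooled accuracy: {pooled:.2f}")
for d, acc in per_device.items():
    print(f"device {d} only:   {acc:.2f}")
```

If accuracy is clearly above chance on pooled data but drops toward chance within each device, the device alone may explain the apparent health discrimination. Because the classes are imbalanced within each device here, a real analysis would likely prefer a balanced metric over plain accuracy.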

Paper Structure (7 sections, 14 equations, 9 figures)

Figures (9)

  • Figure 1: Illustration of our proposed causal framework for investigating and validating acoustic knee biomarkers. (a) Multiple audio sources may be present in the experimental environment, including the examined knee. (b) These sources may be picked up by the acoustical sensor and may influence the extracted acoustical features. (c) These external sources may introduce bias in the final results, severely inflating the model's performance and leading to wrong conclusions.
  • Figure 2: Illustration of our counterfactual experiment. By changing the expectation on the output (asking "what if"), the same audio data are interpreted differently. This interpretation is not necessarily causally linked to the underlying knee mechanism we are trying to describe.
  • Figure 3: Separation of Healthy / Unhealthy subsets based on MFCC8 and MFCC11.
  • Figure 4: External 33 kHz interference and its effect on the health classification task. Representative time-frequency representations of the audio recordings of a Healthy (a) and an Unhealthy (b) individual. The constant interference component that characterizes the unhealthy samples is visible at around 33 kHz. (c) The accuracy of the knee health classification as a function of the frequency range, demonstrating the effect of external interference on the classification task.
  • Figure 5: Illustration of the process to assess the effect of the device on the health classification performance by conditioning on the device.
  • ...and 4 more figures
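The frequency-range analysis described for Figure 4(c) can be sketched as a band-by-band ablation: classify using the energy in one frequency band at a time and see which bands carry the discriminative information. The example below is a minimal illustration on synthetic data, not the authors' pipeline or dataset. It assumes white-noise recordings sampled at 96 kHz, a constant 33 kHz tone injected only into the "unhealthy" class, 6 kHz bands, and a threshold classifier. All function names and parameter values are hypothetical.

```python
import numpy as np

def make_recordings(n_per_class=40, fs=96_000, n_samples=9_600,
                    tone_hz=33_000, seed=0):
    """White noise for every recording, plus a constant tone (the assumed
    external interference) added only to class 1."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / fs
    X = rng.standard_normal((2 * n_per_class, n_samples))
    y = np.repeat([0, 1], n_per_class)
    X[y == 1] += 0.5 * np.sin(2 * np.pi * tone_hz * t)
    return X, y, fs

def band_log_energy(X, fs, f_lo, f_hi):
    """Log of the mean spectral power in [f_lo, f_hi) for each recording."""
    power = np.abs(np.fft.rfft(X, axis=1)) ** 2
    freqs = np.fft.rfftfreq(X.shape[1], d=1 / fs)
    mask = (freqs >= f_lo) & (freqs < f_hi)
    return np.log(power[:, mask].mean(axis=1))

def threshold_accuracy(feature, y, train_idx, test_idx):
    """Midpoint-of-class-means threshold fit on train, scored on test."""
    f_tr, y_tr = feature[train_idx], y[train_idx]
    m0, m1 = f_tr[y_tr == 0].mean(), f_tr[y_tr == 1].mean()
    thr = (m0 + m1) / 2
    f_te = feature[test_idx]
    pred = f_te > thr if m1 >= m0 else f_te <= thr
    return float(np.mean(pred == (y[test_idx] == 1)))

X, y, fs = make_recordings()
train_idx = np.arange(0, len(y), 2)   # even rows for fitting
test_idx = np.arange(1, len(y), 2)    # odd rows for scoring

band_width = 6_000
for f_lo in range(0, fs // 2, band_width):
    feat = band_log_energy(X, fs, f_lo, f_lo + band_width)
    acc = threshold_accuracy(feat, y, train_idx, test_idx)
    print(f"{f_lo / 1000:4.0f}-{(f_lo + band_width) / 1000:2.0f} kHz: accuracy {acc:.2f}")
```

In this toy setup, only the band containing the injected tone is expected to separate the classes. A similar localized jump in accuracy on real recordings would suggest that the classifier relies on a narrowband external source rather than on knee-generated sound.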