Acoustical Features as Knee Health Biomarkers: A Critical Analysis
Christodoulos Kechris, Jerome Thevenot, Tomas Teijeiro, Vincent A. Stadelmann, Nicola A. Maffiuletti, David Atienza
TL;DR
This work critiques the use of acoustical knee signals as biomarkers by introducing a formal causal framework that separates knee-origin information from external sources. It formalizes the relationship among knee health $H$, ideal vibrations $V$, observed signals $\tilde{V}$, features $X=g(\tilde{V})$, and predictions $Y=f(X)$, emphasizing that $\tilde{V} \approx V$ must hold and that a Bias Introduction Pathway driven by external sources must be accounted for. Three real-world studies show that high classification accuracy can stem from researchers' expectations, the experimental protocol, or hardware bias rather than true knee pathology: a 96% leave-one-subject-out (LOSO) accuracy in a counterfactual setting, 33 kHz interference in a public dataset, and device-induced bias that can entirely account for health discrimination. The findings argue for rigorous causal attribution, environmental and device controls, and cautious labeling grounded in clinical information as prerequisites for reliably validating acoustical biomarkers of knee health.
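As a compact reading aid, the causal chain described above can be sketched as follows. The symbol $E$ for external sources is an assumption introduced here for illustration and is not notation taken from the paper.

$$
H \longrightarrow V \longrightarrow \tilde{V} \xrightarrow{\;g\;} X \xrightarrow{\;f\;} Y, \qquad E \longrightarrow \tilde{V}
$$

Under this reading, the Bias Introduction Pathway is the route $E \to \tilde{V} \to X \to Y$. If $E$ is associated with the health label, $Y$ can appear to track $H$ even when $X$ carries no knee-origin information.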
Abstract
Acoustical knee health assessment has long promised an alternative to clinically available medical imaging tools, but this modality has yet to be adopted in medical practice. The field is currently led by machine learning models that process acoustical features and have reported promising diagnostic performance. However, these methods overlook the intricate multi-source nature of audio signals and the underlying mechanisms at play. To address this gap, this paper introduces a novel causal framework for validating knee acoustical features. We argue that current machine learning methodologies for acoustical knee diagnosis lack the required assurances and thus cannot be used to classify acoustic features as biomarkers. Our framework establishes the theoretical guarantees needed to validate such a biomarker claim. We apply our methodology to three real-world experiments investigating the effect of researchers' expectations, the experimental protocol, and the wearable sensor employed. This investigation reveals latent issues such as shortcut learning and performance inflation. This study is the first independent reproduction of results in the field of acoustical knee health evaluation. We conclude with actionable insights from our findings, offering guidance for navigating these limitations in future research.
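The shortcut-learning concern raised above can be illustrated with a minimal synthetic sketch. It is not the authors' experiment or data: the functions `simulate_recordings` and `loso_accuracy`, the parameter `device_offset`, and the nearest-centroid classifier are all hypothetical choices made here for illustration. The simulated feature contains no knee-origin information at all, only noise plus an offset tied to the recording device, and the device is assumed to differ between the two label groups.

```python
import numpy as np


def simulate_recordings(n_subjects=20, n_recordings=10, device_offset=4.0, seed=0):
    """Synthetic feature with NO knee-origin information (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    subjects = np.repeat(np.arange(n_subjects), n_recordings)
    # First half of the subjects labeled healthy (0), second half pathological (1).
    labels = (subjects >= n_subjects // 2).astype(int)
    # Assumed confound: each label group was recorded with a different device.
    device = labels.copy()
    # Feature = pure noise + device-dependent offset; health never enters.
    X = rng.normal(size=(subjects.size, 1)) + device_offset * device[:, None]
    return X, labels, subjects


def loso_accuracy(X, y, groups):
    """Leave-one-subject-out accuracy of a nearest-centroid classifier."""
    correct = 0
    for subject in np.unique(groups):
        test = groups == subject
        train = ~test
        # Class centroids estimated from the training subjects only.
        c0 = X[train & (y == 0)].mean(axis=0)
        c1 = X[train & (y == 1)].mean(axis=0)
        d0 = np.linalg.norm(X[test] - c0, axis=1)
        d1 = np.linalg.norm(X[test] - c1, axis=1)
        pred = (d1 < d0).astype(int)
        correct += int((pred == y[test]).sum())
    return correct / y.size


# Device confounded with the label: accuracy is high despite no health signal.
biased_acc = loso_accuracy(*simulate_recordings(device_offset=4.0))
# Same device for everyone: the apparent discrimination disappears.
controlled_acc = loso_accuracy(*simulate_recordings(device_offset=0.0))

print(f"LOSO accuracy with device confound:    {biased_acc:.2f}")
print(f"LOSO accuracy without device confound: {controlled_acc:.2f}")
```

In this toy setting, subject-wise cross-validation does not protect against the confound, because the device assignment is constant within each subject and aligned with the label. That is the kind of situation in which high accuracy alone cannot support a biomarker claim.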
