Audio-based Step-count Estimation for Running -- Windowing and Neural Network Baselines
Philipp Wagner, Andreas Triantafyllopoulos, Alexander Gebhard, Björn Schuller
TL;DR
The paper tackles audio-based estimation of running step counts by regressing the number of steps within fixed-length windows from Mel-spectrogram representations. It evaluates a wide range of architectures, including CNNs, PANNs, and Transformer-based models, on the KIRun dataset with IMU-derived ground truth, and reports a best result of $MAE=1.098$ and $PCC=0.479$ for 5-second windows. Ablation studies show mixed effects for data augmentation and transfer learning, with ImageNet pretraining often helping. Windowing strategies significantly affect performance, although some strategies benefit from data leakage. Overall, the work demonstrates the feasibility of audio-based running monitoring and lays the groundwork for future work on step-level detection and related tasks such as fatigue and surface-type classification.
Abstract
In recent decades, running has become an increasingly popular pastime due to its accessibility, ease of practice, and anticipated health benefits. However, the risk of running-related injuries is substantial for runners of all experience levels. Several common injuries result from overuse, that is, running beyond the recommended duration and intensity. Recently, audio-based tracking has emerged as a further modality for monitoring running behaviour and performance, with previous studies largely concentrating on predicting runner fatigue. In this work, we investigate audio-based step-count estimation during outdoor running, achieving a mean absolute error of 1.098 in window-based step-count differences and a Pearson correlation coefficient of 0.479 when predicting the number of steps in a 5-second window of audio. Our work thus showcases the feasibility of audio-based monitoring for estimating important physiological variables and lays the foundations for further utilising audio sensors for a more thorough characterisation of runner behaviour.
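To make the window-based formulation and the two reported metrics concrete, the following is a minimal sketch rather than the authors' pipeline. It assumes that step timestamps (for example, derived from the IMU signal) are available in seconds and that windows are non-overlapping; the function names `count_steps_per_window`, `mae`, and `pcc`, the `hop` parameter, and the toy values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def count_steps_per_window(step_times, duration, window=5.0, hop=5.0):
    """Count step events inside each half-open interval [start, start + window).

    Assumption: `step_times` holds step timestamps in seconds; `hop` equal to
    `window` yields non-overlapping windows.
    """
    step_times = np.sort(np.asarray(step_times, dtype=float))
    # Window start times; the small epsilon keeps the last full window.
    starts = np.arange(0.0, duration - window + 1e-9, hop)
    # Index of the first step at or after each window boundary.
    lo = np.searchsorted(step_times, starts, side="left")
    hi = np.searchsorted(step_times, starts + window, side="left")
    return hi - lo


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted per-window counts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def pcc(y_true, y_pred):
    """Pearson correlation coefficient between true and predicted counts."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


# Toy example: 15 seconds of hypothetical step timestamps, 5-second windows.
toy_steps = [0.5, 1.0, 1.5, 2.0, 5.5, 6.0, 7.0, 11.0, 12.0, 13.0, 14.0, 14.5]
targets = count_steps_per_window(toy_steps, duration=15.0)  # counts: 4, 3, 5
predictions = [5, 3, 4]  # made-up model outputs for illustration

print(targets.tolist(), mae(targets, predictions), pcc(targets, predictions))
```

In this toy case the per-window targets are 4, 3, and 5 steps, the mean absolute error is 2/3 of a step per window, and the Pearson correlation is 0.5. The two metrics answer different questions: the error measures how far off each window's count is, while the correlation measures whether predictions rise and fall with the true counts.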
