Advancements in Repetitive Action Counting: Joint-Based PoseRAC Model With Improved Performance
Haodong Chen, Ming C. Leu, Md Moniruzzaman, Zhaozheng Yin, Solmaz Hajmohammadi
TL;DR
This work tackles repetitive action counting (RepCount) by addressing limitations of RGB-frame and landmark-only approaches, notably sensitivity to camera viewpoint and miscounts. It extends PoseRAC by fusing five joint-angle features with pose landmarks, achieving an MAE of $0.211$ and an OBO of $0.599$ on the RepCount dataset, which improves on the prior state of the art in MAE. The method leverages pose saliency concepts and density-map visualization, using a Swin Transformer-based density map and an action-trigger mechanism to identify salient pose sequences across video frames. The approach is robust to camera angle, discriminates better between sub-actions, and recognizes salient poses more accurately, offering practical benefits for fitness tracking and rehabilitation.
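To illustrate why joint angles help with viewpoint changes, the following is a minimal sketch of computing one joint angle from three pose landmarks and appending it to a landmark feature vector. The function names, the choice of landmarks, and the normalization by 180 degrees are assumptions made for illustration; the summary above does not specify which five angles are used or how they are fused.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at joint b, formed by the segments b->a and b->c.

    a, b, c are landmark coordinates (2D or 3D tuples). Hypothetical helper.
    """
    ba = [ai - bi for ai, bi in zip(a, b)]
    bc = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0.0:
        return 0.0  # degenerate case: coincident landmarks
    cos_theta = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos_theta))

def fuse_features(landmarks, angle_triplets):
    """Flatten landmark coordinates and append normalized joint angles.

    landmarks: list of coordinate tuples.
    angle_triplets: list of (i, j, k) index triples; the angle is taken at j.
    Both the layout and the /180 scaling are assumptions for this sketch.
    """
    flat = [coord for point in landmarks for coord in point]
    angles = [
        joint_angle(landmarks[i], landmarks[j], landmarks[k]) / 180.0
        for i, j, k in angle_triplets
    ]
    return flat + angles

# Toy example: shoulder, elbow, wrist forming a right angle at the elbow.
shoulder, elbow, wrist = (1.0, 0.0), (0.0, 0.0), (0.0, 1.0)
elbow_angle = joint_angle(shoulder, elbow, wrist)
features = fuse_features([shoulder, elbow, wrist], [(0, 1, 2)])
```

Because the angle depends only on the relative directions of the two limb segments, it is unchanged when the landmarks are translated or uniformly scaled in the image, whereas the raw coordinates change.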
Abstract
Repetitive action counting (RepCount) is critical in various applications, such as fitness tracking and rehabilitation. Previous methods have relied on red-green-blue (RGB) frames and body pose landmarks to identify the number of action repetitions, but these methods suffer from several issues: unstable handling of changes in camera viewpoint, over-counting, under-counting, difficulty in distinguishing between sub-actions, and inaccuracy in recognizing salient poses. In this paper, building on the work in [1], we integrate joint angles with body pose landmarks to address these challenges and achieve better results than state-of-the-art RepCount methods, with a Mean Absolute Error (MAE) of 0.211 and an Off-By-One (OBO) counting accuracy of 0.599 on the RepCount dataset [2]. Comprehensive experimental results demonstrate the effectiveness and robustness of our method.
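For readers unfamiliar with the two metrics, here is a minimal sketch of how MAE and OBO are commonly computed in the repetition-counting literature: MAE as the mean absolute count error normalized by the ground-truth count (lower is better), and OBO as the fraction of videos whose predicted count is within one of the ground truth (higher is better). The abstract does not spell out these formulas, so the definitions are an assumption about the paper's evaluation; the function names and sample counts are illustrative only.

```python
def normalized_mae(pred_counts, true_counts):
    """Mean of |pred - true| / true over all videos.

    Assumes every ground-truth count is positive.
    """
    errors = [abs(p - t) / t for p, t in zip(pred_counts, true_counts)]
    return sum(errors) / len(errors)

def obo_accuracy(pred_counts, true_counts):
    """Fraction of videos with |pred - true| <= 1."""
    hits = [abs(p - t) <= 1 for p, t in zip(pred_counts, true_counts)]
    return sum(hits) / len(hits)

# Made-up counts for four videos (not results from the paper).
pred = [10, 4, 7, 12]
true = [10, 5, 9, 12]

mae = normalized_mae(pred, true)  # (0 + 1/5 + 2/9 + 0) / 4
obo = obo_accuracy(pred, true)    # 3 of 4 videos are within one count
```

Note that the two metrics can disagree: a model may be off by exactly one on many long videos, giving a high OBO with a nonzero MAE, or be exact on most videos but far off on a few.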
