Design Space Exploration on Efficient and Accurate Human Pose Estimation from Sparse IMU-Sensing

Iris Fürst-Walter, Antonio Nappi, Tanja Harbaum, Jürgen Becker

TL;DR

This work addresses the problem of privacy-preserving, energy-efficient human pose estimation by exploring the design space of sparse IMU sensing. It introduces a simulative Design Space Exploration (DSE) that synthesizes IMU data from a body-model dataset, trains a deep learning estimator for thousands of sensor configurations, and evaluates them with a unified accuracy-resource metric. A key contribution is the combined metric $M_i(\lambda) = e_i(1-\lambda) + \lambda i$, which balances pose accuracy (e.g., mesh error) against hardware costs (sensor count). The results identify a four-sensor configuration (pelvis, sternum, and elbows) that achieves a mesh error of 6.03 cm and reduces sensor count by two compared to the state of the art, demonstrating a strong accuracy-resource trade-off for practical health applications. The method lays groundwork for privacy-aware, resource-conscious design of fabric-integrated HPE systems and can guide deployment across rehabilitation and sports domains.
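To make the combined metric concrete, the following minimal sketch evaluates $M_i(\lambda) = e_i(1-\lambda) + \lambda i$ for a few sensor counts and picks the count with the lowest score. The function names and the error values are illustrative placeholders, not results or code from the paper; only the formula itself is taken from the text above.

```python
def combined_metric(error, num_sensors, lam):
    """Combined metric M_i(lambda) = e_i * (1 - lambda) + lambda * i.

    error       -- pose error e_i of the best configuration with i sensors
    num_sensors -- sensor count i (the hardware cost)
    lam         -- hardware weight lambda in [0, 1]
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return error * (1.0 - lam) + lam * num_sensors


def best_sensor_count(errors_by_count, lam):
    """Return the sensor count i that minimizes M_i(lam)."""
    return min(
        errors_by_count,
        key=lambda i: combined_metric(errors_by_count[i], i, lam),
    )


# Placeholder errors (in cm) per sensor count. These are made-up numbers
# for illustration only, NOT measurements reported in the paper.
example_errors = {2: 12.0, 4: 6.0, 6: 5.0, 8: 4.5}

for lam in (0.0, 0.5, 1.0):
    print(f"lambda={lam}: best sensor count = {best_sensor_count(example_errors, lam)}")
```

With $\lambda = 0$ only accuracy counts, so the largest configuration wins; with $\lambda = 1$ only the sensor count matters, so the smallest one wins. Intermediate weights trade the two off against each other.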

Abstract

Human Pose Estimation (HPE) for assessing human motion in sports, rehabilitation, or work safety requires accurate sensing without compromising the sensitive underlying personal data. Local processing is therefore necessary, and the limited energy budget of such systems can be addressed by using Inertial Measurement Units (IMUs) instead of common camera sensing. The central trade-off between accuracy and efficient use of hardware resources is rarely discussed in research. We address this trade-off with a simulative Design Space Exploration (DSE) that varies the number and positioning of IMU sensors. First, we generate IMU data from a publicly available body model dataset for different sensor configurations and train a deep learning model with this data. Additionally, we propose a combined metric to assess the accuracy-resource trade-off. We use the DSE as a tool to evaluate sensor configurations and identify beneficial ones for a specific use case. As an example, for a system that weights accuracy and resources equally, we identify an optimal configuration of 4 sensors with a mesh error of 6.03 cm, which increases accuracy by 32.7% and reduces the hardware effort by two sensors compared to the state of the art. Our work can be used to design health applications with well-suited sensor positioning and attention to data privacy and resource awareness.
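The design space of "varying quantity and positioning" can be pictured as the set of subsets of a basic sensor configuration. The sketch below enumerates such subsets with the standard library. The candidate position names and the exhaustive enumeration strategy are assumptions made for illustration; the paper's actual basic configuration (Figure 3) and any constraints it applies may differ.

```python
from itertools import combinations

# Hypothetical candidate positions, chosen only for this example.
CANDIDATES = [
    "pelvis", "sternum", "head",
    "left_elbow", "right_elbow",
    "left_knee", "right_knee",
]


def enumerate_configurations(candidates, min_sensors=1, max_sensors=None):
    """Yield every subset of `candidates` with min_sensors..max_sensors elements."""
    if max_sensors is None:
        max_sensors = len(candidates)
    for k in range(min_sensors, max_sensors + 1):
        # combinations() keeps the input order and never repeats a subset.
        for subset in combinations(candidates, k):
            yield subset


configs = list(enumerate_configurations(CANDIDATES, min_sensors=1, max_sensors=4))
print(len(configs), "configurations with up to 4 sensors")
```

Each yielded tuple would then correspond to one experiment: synthesize IMU data for those positions, train the model, and record its error. The number of subsets grows quickly with the number of candidate positions, which is why a scalar metric for ranking configurations is useful.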

Paper Structure (8 sections, 1 equation, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Methodology of our Design Space Exploration. We define a basic sensor configuration and synthesize IMU data for sensor subsets. With this data, we train a deep neural network and evaluate diverse error metrics for each subset. Finally, we analyse all experiments to identify a beneficial sensor positioning.
  • Figure 2: Segments and joint positions of Loper2015. (a) Segmentation of the body model, where the white dots depict the joints detailed in (b); shaded joints are considered neither in Huang2018 nor in our work. (b) Back view of the model, i.e., the left side of (a).
  • Figure 3: Basic sensor configuration illustrating all possible sensor positions considered for the DSE.
  • Figure 4: Accuracy of the sensor configurations for different numbers of sensors.
  • Figure 5: Combined metric $M_i$ illustrating the accuracy-resource trade-off for varying hardware weight $\lambda$, shown for the most accurate sensor configuration of $i$ sensors. The jitter is scaled by 0.1 so that the error metric is of the same order of magnitude as the number of sensors. The vertical blue line indicates a design with equal weight on prediction performance and hardware costs ($\lambda = 50\%$).
  • ...and 1 more figure
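The data-synthesis step in Figure 1 can be illustrated with a small sketch. One common way to obtain virtual accelerometer readings from a body model is to track the position of a virtual sensor over time and take a second-order finite difference. Whether the paper uses exactly this scheme is an assumption here; the function name is hypothetical, and gravity, sensor noise, and orientation readings are deliberately left out.

```python
def synthesize_acceleration(positions, dt):
    """Approximate acceleration from sampled 3D positions.

    positions -- list of (x, y, z) tuples, one per frame, in metres
    dt        -- time between frames in seconds
    Returns one (ax, ay, az) tuple per interior frame, computed with the
    central difference a_t = (p_{t-1} - 2 * p_t + p_{t+1}) / dt**2.
    """
    if len(positions) < 3:
        raise ValueError("need at least three frames")
    if dt <= 0:
        raise ValueError("dt must be positive")
    accelerations = []
    for prev, cur, nxt in zip(positions, positions[1:], positions[2:]):
        accelerations.append(
            tuple((p - 2.0 * c + n) / (dt * dt) for p, c, n in zip(prev, cur, nxt))
        )
    return accelerations


# Sanity check on a known motion: x(t) = 0.5 * a * t**2 with a = 2 m/s^2.
dt = 0.5
track = [(0.5 * 2.0 * (k * dt) ** 2, 0.0, 0.0) for k in range(4)]
print(synthesize_acceleration(track, dt))
```

For a constant-acceleration track the central difference recovers the acceleration exactly, which makes it a convenient check before applying the function to positions taken from a body model.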