GTA-Net: An IoT-Integrated 3D Human Pose Estimation System for Real-Time Adolescent Sports Posture Correction

Shizhe Yuan, Li Zhou

TL;DR

GTA-Net is an intelligent system for posture correction and real-time feedback in adolescent sports, integrated within an IoT-enabled environment. It enhances real-time posture correction and has broad applications in intelligent sports and health management.

Abstract

With the advancement of artificial intelligence, 3D human pose estimation-based systems for sports training and posture correction have gained significant attention in adolescent sports. However, existing methods face challenges in handling complex movements, providing real-time feedback, and accommodating diverse postures, particularly with occlusions, rapid movements, and the resource constraints of Internet of Things (IoT) devices, making it difficult to balance accuracy and real-time performance. To address these issues, we propose GTA-Net, an intelligent system for posture correction and real-time feedback in adolescent sports, integrated within an IoT-enabled environment. This model enhances pose estimation in dynamic scenes by incorporating Graph Convolutional Networks (GCN), Temporal Convolutional Networks (TCN), and Hierarchical Attention mechanisms, achieving real-time correction through IoT devices. Experimental results show GTA-Net's superior performance on the Human3.6M, HumanEva-I, and MPI-INF-3DHP datasets, with Mean Per Joint Position Error (MPJPE) values of 32.2 mm, 15.0 mm, and 48.0 mm, respectively, significantly outperforming existing methods. The model also demonstrates strong robustness in complex scenarios, maintaining high accuracy even with occlusions and rapid movements. This system enhances real-time posture correction and offers broad applications in intelligent sports and health management.
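For readers unfamiliar with the metric quoted in the abstract, MPJPE is conventionally the Euclidean distance between predicted and ground-truth 3D joint positions, averaged over all joints and frames. The sketch below shows that standard definition in NumPy. The function name `mpjpe` and the array layout `(frames, joints, 3)` are assumptions made for illustration, and any alignment step used in the paper's evaluation protocol (for example, root-joint centering) is not modeled here.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per Joint Position Error.

    pred, gt: arrays of shape (frames, joints, 3) in the same unit (e.g. mm).
    The layout and function name are illustrative assumptions, not taken
    from the paper.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Euclidean distance per joint, then the mean over joints and frames.
    per_joint_error = np.linalg.norm(pred - gt, axis=-1)
    return float(per_joint_error.mean())

# Toy data: every joint is displaced by the vector (3, 4, 0),
# so each per-joint error is exactly 5.
gt = np.zeros((2, 17, 3))
pred = gt + np.array([3.0, 4.0, 0.0])
error = mpjpe(pred, gt)
```

A lower value means the predicted skeleton lies closer to the ground truth, which is why the abstract reports one MPJPE figure per dataset.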

Paper Structure

This paper contains 24 sections, 15 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: The IoT-based framework for 3D pose estimation and real-time feedback. The system collects posture data using motion capture sensors, video sensors, and depth cameras. This data is processed by a central unit via IoT technology, enabling real-time 3D pose estimation and feedback for posture correction.
  • Figure 2: The overall architecture of the proposed system, combining GCN, TCN, and Hierarchical Attention mechanisms. The system utilizes Joint-GCN and Bone-GCN to capture local and global spatial relationships, while the Hierarchical Attention-augmented TCN refines temporal dynamics, enabling accurate 3D pose estimation in dynamic sports scenarios.
  • Figure 3: The structure of the proposed GCN model: (a) illustrates the basic Graph Convolutional Network operation, while (b) presents the dual-stream architecture combining joint (J-Stream) and bone (B-Stream) information for enhanced 3D pose estimation. The integration of Joint-GCN and Bone-GCN modules improves the system's ability to capture both local and global spatial relationships, essential for accurate pose estimation.
  • Figure 4: The structure of the Attention-Augmented Temporal Convolutional Network. This network is responsible for capturing the temporal dynamics of human motion, with attention mechanisms refining temporal features to improve pose estimation in dynamic movements. The TCN applies causal and dilated convolutions to efficiently process time-series data, ensuring real-time performance.
  • Figure 5: The hierarchical attention network architecture [yang2016hierarchical]. This architecture integrates Hierarchical Attention into the GCN-TCN framework, allowing the model to focus on critical features across different levels, from local joints to global skeletal structures. This multi-level attention ensures more precise and stable 3D pose estimation in complex motion scenarios.
  • ...and 2 more figures
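
The Figure 4 caption mentions causal and dilated convolutions. As a minimal sketch of that general technique, and not of the paper's actual TCN layer, the NumPy function below applies a single-channel causal dilated convolution to a 1D time series. The name `causal_dilated_conv1d`, the single-channel setup, and the zero left-padding are all assumptions made for illustration.

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation=1):
    """Single-channel causal dilated convolution (illustrative sketch).

    out[t] = sum_i kernel[i] * x[t - i * dilation],
    with samples before t = 0 treated as zero. Hypothetical helper,
    not the layer used in the paper.
    """
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    k = len(kernel)
    pad = (k - 1) * dilation
    # Left-pad only, so no output ever depends on a future sample.
    padded = np.concatenate([np.zeros(pad), x])
    out = np.zeros_like(x)
    for t in range(len(x)):
        for i in range(k):
            # kernel[i] weights the sample i * dilation steps in the past.
            out[t] += kernel[i] * padded[pad + t - i * dilation]
    return out

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = causal_dilated_conv1d(x, kernel=[1.0, 1.0], dilation=2)

# Causality check: changing the last input leaves earlier outputs unchanged.
x_future = x.copy()
x_future[-1] = 100.0
y_future = causal_dilated_conv1d(x_future, kernel=[1.0, 1.0], dilation=2)
```

With kernel size 2 and dilation 2, each output adds the current sample to the one two steps earlier. Stacking such layers with growing dilation widens the temporal receptive field without looking ahead, which is the property that makes the approach suitable for streaming feedback.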