LSP-YOLO: A Lightweight Single-Stage Network for Sitting Posture Recognition on Embedded Devices
Nanjun Li, Ziyue Hao, Quanqiang Wang, Xuanyin Wang
TL;DR
This work tackles real-time sitting posture recognition on embedded edge devices by replacing traditional two-stage pipelines with a single-stage, end-to-end network, LSP-YOLO. The recognition head maps keypoints directly to posture classes through pointwise convolution, and the Light-C3k2 module combines partial convolution (PConv) with SimAM attention to reduce computation. A 5,000-image dataset covering six postures supports training and testing. The smallest model, LSP-YOLO-n, reaches 94.2% accuracy at 251 FPS on a PC with a model size of 1.9 MB, and deployment on the SV830C + GC030A platform confirms real-time inference under constrained resources. Overall, the method is a compact, fast, deployable alternative to two-stage approaches, improving inference efficiency while keeping accuracy competitive for posture monitoring and human-computer interaction applications.
Abstract
With the rise in sedentary behavior, health problems caused by poor sitting posture have drawn increasing attention. Most existing methods, whether based on invasive sensors or computer vision, rely on two-stage pipelines, which result in high intrusiveness, intensive computation, and poor real-time performance on embedded edge devices. Inspired by YOLOv11-Pose, this paper proposes LSP-YOLO, a lightweight single-stage network for sitting posture recognition on embedded edge devices. By integrating partial convolution (PConv) and the Similarity-Aware Activation Module (SimAM), a lightweight module, Light-C3k2, was designed to reduce computational cost while maintaining feature extraction capability. In the recognition head, keypoints were mapped directly to posture classes through pointwise convolution, and intermediate supervision was employed to enable efficient fusion of pose estimation and classification. Furthermore, a dataset containing 5,000 images across six posture categories was constructed for model training and testing. The smallest trained model, LSP-YOLO-n, achieved 94.2% accuracy and 251 FPS on a personal computer (PC) with a model size of only 1.9 MB. In addition, real-time, high-accuracy inference under constrained computational resources was demonstrated on the SV830C + GC030A platform. The proposed approach is efficient, lightweight, and deployable, making it suitable for smart classrooms, rehabilitation, and human-computer interaction applications.
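The three building blocks named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the SimAM weighting follows the commonly published closed form (with an assumed regularizer `lam`), the PConv cost assumes that only one quarter of the channels are convolved (the ratio used in Light-C3k2 is not stated here), and `pointwise_head` is a hypothetical helper showing that a 1x1 convolution at one grid cell reduces to a linear map from a keypoint vector to class logits.

```python
import math


def simam(channel, lam=1e-4):
    """Apply SimAM weighting to one feature-map channel (a list of rows)."""
    values = [v for row in channel for v in row]
    n = len(values) - 1  # requires at least two spatial positions
    mean = sum(values) / len(values)
    sq_dev = [(v - mean) ** 2 for v in values]
    var = sum(sq_dev) / n
    width = len(channel[0])
    out = []
    for r, row in enumerate(channel):
        new_row = []
        for c, v in enumerate(row):
            # Inverse energy: positions far from the channel mean score higher.
            inv_energy = sq_dev[r * width + c] / (4.0 * (var + lam)) + 0.5
            gate = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid
            new_row.append(v * gate)
        out.append(new_row)
    return out


def conv_macs(height, width, kernel, c_in, c_out):
    """Multiply-accumulates of a dense kernel x kernel convolution."""
    return height * width * kernel * kernel * c_in * c_out


def pconv_macs(height, width, kernel, channels, ratio=4):
    """MACs when only channels // ratio channels are convolved (assumed ratio)."""
    c_p = channels // ratio
    return conv_macs(height, width, kernel, c_p, c_p)


def pointwise_head(keypoint_vec, weights, bias):
    """Hypothetical 1x1-conv head at one grid cell: logits = W x + b."""
    return [
        sum(w * x for w, x in zip(row, keypoint_vec)) + b
        for row, b in zip(weights, bias)
    ]


if __name__ == "__main__":
    dense = conv_macs(40, 40, 3, 64, 64)
    partial = pconv_macs(40, 40, 3, 64)
    print("dense / partial MAC ratio:", dense / partial)
```

Under the assumed one-quarter ratio, the partial convolution needs one sixteenth of the multiply-accumulates of a dense convolution on the same feature map, while SimAM adds no learnable parameters. This is consistent with the abstract's emphasis on low computational cost, although the actual savings depend on the configuration used in the paper.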
