On-Device Training Empowered Transfer Learning For Human Activity Recognition
Pixi Kang, Julian Moosmann, Sizhen Bian, Michele Magno
TL;DR
The paper tackles user-induced concept drift (UICD) in human activity recognition (HAR) by introducing on-device transfer learning (ODTL) that updates only the classifier on MCU-grade devices, thereby preserving privacy and reducing data transfer. It designs quantized, lightweight on-device training engines for STM32F7 and GAP9, and evaluates them on three sensor modalities (RecGym, QVAR, Ultra) to quantify the impact of UICD and the gains from personalization. Results show that ODTL improves accuracy (RecGym +3.73%, QVAR +17.38%, Ultra +3.70%) and that GAP9 dramatically outperforms STM32F7 in both latency (≈20x) and energy (inference up to ≈120x, ODTL up to ≈280x), demonstrating the practicality of edge continual learning for HAR. The study underscores the potential of low-power parallel edge hardware to enable real-time, privacy-preserving, personalized HAR on resource-constrained devices and outlines future directions for risk management and efficient use of user data.
Abstract
Human Activity Recognition (HAR) is an attractive topic for perceiving human behavior and supplying assistive services. Besides the classical inertial-unit and vision-based HAR methods, new sensing technologies, such as ultrasound and body-area electric fields, have emerged in HAR to enhance user experience and accommodate new application scenarios. As those sensors are often paired with AI for HAR, they frequently encounter challenges due to limited training data compared to the more widely used IMU or vision-based HAR solutions. Additionally, user-induced concept drift (UICD) is common in such HAR scenarios. UICD is characterized by deviations in the sample distribution of new users from that of the training participants, leading to deteriorated recognition performance. This paper proposes an on-device transfer learning (ODTL) scheme tailored for energy- and resource-constrained IoT edge devices. Optimized on-device training engines are developed for two representative MCU-level edge computing platforms: STM32F756ZG and GAP9. Based on these engines, we evaluate the benefits of ODTL in three HAR scenarios: body capacitance-based gym activity recognition, QVAR-based hand gesture recognition, and ultrasonic-based hand gesture recognition. We demonstrate improvements of 3.73%, 17.38%, and 3.70% in activity recognition accuracy, respectively. In addition, we observe that the RISC-V-based GAP9 achieves 20x lower latency and 280x lower power consumption than the STM32F7 MCU during ODTL deployment, demonstrating the advantages of employing the latest low-power parallel computing devices for edge tasks.
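To make the classifier-only update concrete, the following is a minimal sketch of the general idea, not the paper's engine: the feature extractor is treated as frozen, and only a softmax classifier is adapted with plain SGD on a few labeled samples from a new user. It uses floating-point Python rather than the quantized MCU implementation described here, and every name, the synthetic features, the learning rate, and the epoch count are assumptions chosen purely for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_probs(weights, bias, feat):
    """Linear classifier head applied to features from a frozen backbone."""
    logits = [sum(w * x for w, x in zip(row, feat)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)

def sgd_step(weights, bias, feat, label, lr):
    """One in-place SGD step on the cross-entropy loss; only the head changes."""
    probs = predict_probs(weights, bias, feat)
    for c in range(len(weights)):
        grad = probs[c] - (1.0 if c == label else 0.0)
        for j in range(len(feat)):
            weights[c][j] -= lr * grad * feat[j]
        bias[c] -= lr * grad

def accuracy(weights, bias, data):
    hits = 0
    for feat, label in data:
        probs = predict_probs(weights, bias, feat)
        if probs.index(max(probs)) == label:
            hits += 1
    return hits / len(data)

# Hypothetical "pretrained" head: it separates the two classes on feature 0.
weights = [[1.0, 0.0], [-1.0, 0.0]]
bias = [0.0, 0.0]

# Synthetic new-user features (assumed, not from any dataset): for this user
# the classes differ on feature 1 instead, which mimics concept drift.
new_user = []
for i in range(10):
    offset = (i - 4.5) * 0.05
    new_user.append(([-0.2 + offset, 1.0 - offset], 0))
    new_user.append(([-0.2 + offset, -1.0 + offset], 1))

acc_before = accuracy(weights, bias, new_user)

# Adapt only the classifier head on the new user's samples.
for _ in range(20):
    for feat, label in new_user:
        sgd_step(weights, bias, feat, label, lr=0.1)

acc_after = accuracy(weights, bias, new_user)
print("accuracy before:", acc_before, "after:", acc_after)
```

Because only the last layer is trained, the per-sample cost is one forward pass through the head plus a weight update proportional to the number of classes times the feature dimension, which is what makes this style of adaptation plausible on MCU-class memory budgets.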
