A Tiny Supervised ODL Core with Auto Data Pruning for Human Activity Recognition

Hiroki Matsutani; Radu Marculescu

TL;DR

This work tackles the challenge of data drift in human activity recognition on edge devices by introducing a tiny supervised on-device learning core that automatically prunes data to minimize label queries from a nearby teacher. Building on OS-ELM, the approach uses lightweight weight storage and a hardware-friendly design, including two variants (ODLBase and ODLHash) that reduce memory while preserving accuracy. A key innovation is the auto data pruning mechanism, which adaptively tunes the confidence threshold to cut communication by up to 55.7% with only about 0.9% accuracy loss. The proposed ODLHash core demonstrates ultra-low power operation (~3.39 mW) and a compact footprint (~136 kB), enabling practical on-device continual learning for HAR under data drift.

Abstract

In this paper, we introduce a low-cost, low-power tiny supervised on-device learning (ODL) core that can address the distributional shift of input data for human activity recognition. Although ODL for resource-limited edge devices has been studied recently, how exactly to provide training labels to these devices at runtime remains an open issue. To address this problem, we propose combining automatic data pruning with supervised ODL to reduce the number of queries needed to acquire predicted labels from a nearby teacher device, and thus save power during model retraining. The data pruning threshold is tuned automatically, eliminating manual threshold tuning. As a tinyML solution operating at a few mW for human activity recognition, we design a supervised ODL core that supports our automatic data pruning using a 45nm CMOS process technology. We show that the memory required by the core is smaller than that of a same-shaped multilayer perceptron (MLP) and that its power consumption is only 3.39mW. Experiments using a human activity recognition dataset show that the proposed automatic data pruning reduces the communication volume by 55.7%, and power consumption accordingly, with only 0.9% accuracy loss.
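The query-saving idea in the abstract can be illustrated with a small sketch. It assumes a fixed threshold and a simple confidence rule: the edge device skips the teacher query when the gap between its two largest class probabilities exceeds `theta`. The function names, the exact comparison, and the example probabilities are illustrative assumptions, and the automatic threshold tuning is not reproduced here.

```python
def top2_margin(probs):
    """Return p1 - p2, the gap between the two largest class probabilities."""
    p1, p2 = sorted(probs, reverse=True)[:2]
    return p1 - p2


def needs_teacher_label(probs, theta):
    """Assumed rule: query the teacher only when the top-2 margin is at most theta."""
    return top2_margin(probs) <= theta


def count_queries(prob_stream, theta):
    """Count how many samples in a stream would trigger a teacher query."""
    return sum(1 for probs in prob_stream if needs_teacher_label(probs, theta))


if __name__ == "__main__":
    # Hypothetical per-sample class probabilities produced by the edge model.
    stream = [
        [0.90, 0.05, 0.05],  # large margin: prediction is trusted, query skipped
        [0.40, 0.35, 0.25],  # small margin: label requested from the teacher
        [0.55, 0.30, 0.15],  # margin below theta: label requested as well
    ]
    queries = count_queries(stream, theta=0.3)
    print(f"{queries} of {len(stream)} samples would be sent to the teacher")
```

Under this assumed rule, a smaller `theta` skips more queries and so lowers the communication volume, while a larger `theta` queries the teacher more often.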

Paper Structure (10 sections, 6 figures, 3 tables, 1 algorithm)

Introduction
Supervised ODL with Auto Pruning
ODL Algorithm
Label Acquisition with Auto Data Pruning
ODL Core Design
Evaluations
ODL Approaches vs. NoODL
Data Pruning with Different Thresholds
Power Savings: A Case Study
Conclusions

Figures (6)

Figure 1: 2-D visualization of a 6-class human activity recognition dataset containing samples obtained from 30 human subjects [Reyes12].
Figure 2: Proposed supervised ODL system. (a) illustrates the overall system consisting of a single teacher and multiple edge devices, where $\bm{x_i} \in \mathbb{R}^n$ and $t_i$ are the $n$-dimensional input data and the corresponding label predicted by the teacher at time $i$. (b) illustrates the prediction algorithm at the edge devices, where $n$, $N$, and $m$ are the numbers of input, hidden, and output layer nodes. $\bm{\alpha} \in \mathbb{R}^{n \times N}$ denotes the weight parameters between the input and hidden layers, and $\bm{\beta_{i-1}} \in \mathbb{R}^{N \times m}$ denotes those between the hidden and output layers trained at time $i-1$. $G_1$ and $G_2$ are the activation functions of the hidden and output layers. $\bm{H_i} \in \mathbb{R}^N$ and $\bm{O_i} \in \mathbb{R}^m$ are their outputs at time $i$. In particular, $O_{i,j}$ denotes the probability of the $j$-th class, and $c_i$ is the predicted class at time $i$. (c) illustrates our label acquisition algorithm, where $p_1$ and $p_2$ denote the probabilities of the top-2 labels, and $\bm{y_i} \in \mathbb{R}^m$ is the one-hot encoded label at time $i$. $\theta$ can be auto-tuned. (d) illustrates the sequential training algorithm, where $\bm{\beta_i} \in \mathbb{R}^{N \times m}$ denotes the new weights updated at time $i$. $\bm{P_i} \in \mathbb{R}^{N \times N}$ denotes the temporary values needed to compute $\bm{\beta_i}$.
Figure 3: Communication volume with different $\theta$.
Figure 4: Power consumption with different $\theta$.
Figure 5: ODL core layout.
...and 1 more figures
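
The prediction and sequential training steps named in the Figure 2 caption can be sketched with the standard one-sample OS-ELM recursion, $\bm{P_i} = \bm{P_{i-1}} - \bm{P_{i-1}} \bm{H_i}^T \bm{H_i} \bm{P_{i-1}} / (1 + \bm{H_i} \bm{P_{i-1}} \bm{H_i}^T)$ and $\bm{\beta_i} = \bm{\beta_{i-1}} + \bm{P_i} \bm{H_i}^T (\bm{y_i} - \bm{H_i} \bm{\beta_{i-1}})$. This is a minimal software sketch, not the hardware core: the sigmoid for $G_1$, the softmax for $G_2$, the absence of a bias term, the hand-picked $\bm{\alpha}$, the scaled-identity initial $\bm{P}$, and the toy two-class data are all assumptions made for illustration.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hidden(x, alpha):
    """H_i = G1(x_i * alpha), alpha is n x N; G1 is assumed to be a sigmoid."""
    return [sigmoid(sum(x[k] * alpha[k][j] for k in range(len(x))))
            for j in range(len(alpha[0]))]


def predict(h, beta):
    """O_i = G2(H_i * beta), beta is N x m; G2 is assumed to be a softmax."""
    scores = [sum(h[j] * beta[j][c] for j in range(len(h)))
              for c in range(len(beta[0]))]
    return softmax(scores)


def train_step(h, y, beta, P):
    """One-sample OS-ELM update; returns the new (beta_i, P_i)."""
    N, m = len(h), len(y)
    Ph = [sum(P[r][j] * h[j] for j in range(N)) for r in range(N)]  # P_{i-1} H_i^T
    hP = [sum(h[j] * P[j][c] for j in range(N)) for c in range(N)]  # H_i P_{i-1}
    denom = 1.0 + sum(h[j] * Ph[j] for j in range(N))  # scalar, so no matrix inverse
    P_new = [[P[r][c] - Ph[r] * hP[c] / denom for c in range(N)] for r in range(N)]
    err = [y[c] - sum(h[j] * beta[j][c] for j in range(N)) for c in range(m)]
    gain = [sum(P_new[r][j] * h[j] for j in range(N)) for r in range(N)]  # P_i H_i^T
    beta_new = [[beta[r][c] + gain[r] * err[c] for c in range(m)] for r in range(N)]
    return beta_new, P_new


def run_demo():
    # Hand-picked alpha (n=2, N=4) for reproducibility; OS-ELM normally draws it at random.
    alpha = [[1.0, -1.0, 0.5, -0.5], [-1.0, 1.0, -0.5, 0.5]]
    N, m = 4, 2
    beta = [[0.0] * m for _ in range(N)]
    # Assumed initialization: P_0 is a scaled identity matrix.
    P = [[100.0 if r == c else 0.0 for c in range(N)] for r in range(N)]
    # Toy stream: each input comes with a one-hot label y_i.
    samples = [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 1.0])]
    for _ in range(10):
        for x, y in samples:
            beta, P = train_step(hidden(x, alpha), y, beta, P)
    return [predict(hidden(x, alpha), beta) for x, _ in samples]


if __name__ == "__main__":
    for probs in run_demo():
        print([round(p, 3) for p in probs], "-> class", probs.index(max(probs)))
```

Because each update processes a single sample, the term that would otherwise require a matrix inverse reduces to a scalar division, and only $\bm{\beta}$ and $\bm{P}$ need to be kept as trainable state between samples.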
