On-device edge learning for IoT data streams: a survey
Afonso Lourenço, João Rodrigo, João Gama, Goreti Marreiros
TL;DR
The paper surveys on-device edge learning for IoT data streams, focusing on continual learning for neural networks and decision trees under resource constraints. It analyzes data architectures (batch vs. stream), network capacity (cloud vs. edge), and the resulting design implications for on-device training. It reviews neural-network approaches that mitigate forgetting and improve adaptivity, and decision-tree (DT) approaches for incremental learning and memory efficiency, emphasizing open-world and tabular-data challenges. It argues that evaluation must be multi-criteria, covering internal representations and forward/backward transfer, and discusses integration challenges and future directions such as federated learning. Overall, the survey highlights a trade-off between expressiveness and convergence, and calls for cohesive edge systems that balance stability and plasticity while supporting efficient online learning.
Abstract
This literature review explores continual learning methods for on-device training in the context of neural networks (NNs) and decision trees (DTs) for classification tasks in smart environments. We highlight key constraints, such as data architecture (batch vs. stream) and network capacity (cloud vs. edge), which shape TinyML algorithm design because data streams arrive naturally and without control. The survey details the challenges of deploying deep learners on resource-constrained edge devices, including catastrophic forgetting, data inefficiency, and the difficulty of handling IoT tabular data in open-world settings. While decision trees are more memory-efficient for on-device training, they are limited in expressiveness and require dynamic adaptations, such as pruning and meta-learning, to handle complex patterns and concept drifts. We emphasize the importance of multi-criteria performance evaluation tailored to edge applications, which assesses both output-based and internal-representation metrics. The key challenge lies in integrating these building blocks into autonomous online systems, taking into account stability-plasticity trade-offs, forward-backward transfer, and model convergence.
