MAcPNN: Mutual Assisted Learning on Data Streams with Temporal Dependence

Federico Giannini, Emanuele Della Valle

TL;DR

This work proposes Mutual Assisted Learning, a learning paradigm grounded in Vygotsky's Sociocultural Theory of Cognitive Development. Its implementation, MAcPNN, enables cPNNs to make single data point predictions and applies quantization to reduce the memory footprint.

Abstract

Internet of Things (IoT) Analytics often involves applying machine learning (ML) models to data streams. In such scenarios, traditional ML paradigms face obstacles related to learning continuously while handling concept drifts and temporal dependence and avoiding forgetting. Moreover, in IoT, different edge devices form a network. When learning models on those devices, connecting them could help improve performance and reuse others' knowledge. This work proposes Mutual Assisted Learning, a learning paradigm grounded in Vygotsky's popular Sociocultural Theory of Cognitive Development. Each device is autonomous and does not need a central orchestrator. Whenever its performance degrades due to a concept drift, it asks the other devices for assistance and decides whether their knowledge is useful for solving the new problem. This way, the number of connections is drastically reduced compared to classical Federated Learning approaches, where the devices communicate at each training round. Every device is equipped with a Continuous Progressive Neural Network (cPNN) to handle the dynamic nature of data streams. We call this implementation Mutual Assisted cPNN (MAcPNN). To implement it, we enable cPNNs to make single data point predictions and apply quantization to reduce the memory footprint. Experimental results show the effectiveness of MAcPNN in boosting performance on synthetic and real data streams.
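
The request-on-drift behaviour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Device` class, `accuracy` helper, and the toy threshold classifiers are hypothetical stand-ins for cPNNs, and the sketch only ranks candidate models on a post-drift window instead of adding and training a new column on each of them. It is meant to show that peers are contacted only when a drift occurs, not at every training round.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a cPNN: maps one feature value to a class label.
Model = Callable[[float], int]


def accuracy(model: Model, data: Sequence[Tuple[float, int]]) -> float:
    """Fraction of (x, y) pairs that the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


@dataclass
class Device:
    name: str
    models: List[Model]
    peers: List["Device"] = field(default_factory=list)
    requests_sent: int = 0

    def share_models(self) -> List[Model]:
        # A peer answers a request with copies of its local models.
        return list(self.models)

    def on_drift(self, recent: Sequence[Tuple[float, int]]) -> Model:
        # Called only when a drift is detected: contact each peer once.
        candidates = list(self.models)
        for peer in self.peers:
            self.requests_sent += 1
            candidates.extend(peer.share_models())
        # Simplification: keep the candidate that best fits post-drift data.
        return max(candidates, key=lambda m: accuracy(m, recent))


def above_zero(x: float) -> int:
    return int(x > 0)


def below_zero(x: float) -> int:
    return int(x < 0)


def above_five(x: float) -> int:
    return int(x > 5)


u1 = Device("U1", [above_zero])
u2 = Device("U2", [below_zero])
u3 = Device("U3", [above_five])
u1.peers = [u2, u3]

# After the drift, U1's stream follows the concept that U2 already knows.
post_drift = [(-2.0, 1), (-1.0, 1), (3.0, 0), (6.0, 0)]
chosen = u1.on_drift(post_drift)
```

In this toy run `U1` sends two requests in total, one per peer, and ends up reusing the model shared by `U2`.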

Paper Structure (11 sections, 9 equations, 7 figures, 2 tables, 1 algorithm)


Figures (7)

  • Figure 1: Size of cPNN and QcPNN (first panel) and QcPNN compression ratio (second panel), considering data points with 10 features. Models are implemented in PyTorch with an LSTM hidden size of 50 and a window size of 10. INT8 quantization is applied.
  • Figure 2: MAL paradigm. When $U_1$ detects a drift on its data stream $DS_1$, it asks the other devices for assistance. $U_2$ and $U_3$ send copies of their local models to $U_1$. $U_1$ builds an ensemble containing its local models and the copies received from the others, then adds a column to each model. Grey represents the frozen columns, while red represents the trainable columns.
  • Figure 3: Concept organization within the data streams of three users' devices. We combine five classification functions with abrupt drifts: some concepts have already been seen by other devices, and some are new. Concepts seen by other devices are highlighted in bold. Drifts are unsynchronized across streams.
  • Figure 4: An example of the evolution of Cohen's Kappa over time for Device 2 on a Weather configuration, highlighting the points at which the start and end metrics are observed. Scores are reset after each drift. The score at data point $d_t$ considers the predictions from the first data point following the last drift up to $d_t$.
  • Figure 5: Critical distance diagram of the Nemenyi test ($\alpha=0.05$) on 100 real data streams (50 of Weather and 50 of AirQuality) on the different start$_j$ and start$_{avg}$ considering the Cohen's Kappa score. A is ARF$_T$, cL is cLSTM, M is MAcPNN, and cP is cPNN. MAcPNN is statistically the best model in all the cases. Applying MAcPNN boosts the adaptation to new concepts.
  • ...and 2 more figures
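
As a rough illustration of why INT8 quantization shrinks the models compared in Figure 1, the following back-of-the-envelope sketch counts the parameters of a single LSTM layer with the caption's settings (10 input features, hidden size 50). It rests on assumptions not taken from the paper: only one LSTM layer is counted (not a full cPNN with several columns), weights are stored in 1 byte after quantization while biases stay in 32-bit floats, and scale/zero-point metadata and serialization overhead are ignored. The resulting numbers are therefore not the values plotted in Figure 1.

```python
def lstm_param_counts(input_size: int, hidden_size: int) -> tuple:
    """Weight and bias counts for one LSTM layer (PyTorch layout:
    4 gates, input and recurrent weight matrices, two bias vectors)."""
    weights = 4 * hidden_size * (input_size + hidden_size)
    biases = 4 * hidden_size * 2
    return weights, biases


def size_bytes(weights: int, biases: int, weight_bytes: int, bias_bytes: int = 4) -> int:
    """Storage needed when each weight takes weight_bytes and each bias bias_bytes."""
    return weights * weight_bytes + biases * bias_bytes


weights, biases = lstm_param_counts(input_size=10, hidden_size=50)
fp32_size = size_bytes(weights, biases, weight_bytes=4)  # float32 weights
int8_size = size_bytes(weights, biases, weight_bytes=1)  # INT8 weights (assumed)
ratio = fp32_size / int8_size

print(f"float32: {fp32_size} B, int8: {int8_size} B, ratio: {ratio:.2f}")
```

Under these assumptions the ratio stays below 4 because the biases are not compressed. The ratio for a real model depends on what the quantization backend actually stores.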