A Retrospective of the Tutorial on Opportunities and Challenges of Online Deep Learning

Cedric Kulbach, Lucas Cazzonelli, Hoang-Anh Ngo, Minh-Huong Le-Nguyen, Albert Bifet

TL;DR

The paper surveys online deep learning in streaming contexts, highlighting opportunities and pitfalls. It details River as a mature online-learning library and introduces Deep-River, a PyTorch-backed bridge to enable neural models on data streams. It presents an anomaly-detection demonstration showing that a Deep-River autoencoder can outperform a conventional online detector, and discusses practical constraints such as limited GPU benefits for per-sample online processing. The work provides practical guidance and tooling to advance online deep learning in real-time systems.

Abstract

Machine learning algorithms have become indispensable in today's world. They support and accelerate the way we make decisions based on the data at hand. This acceleration means that data structures that were valid at one moment may no longer be valid in the future. As these data structures change, machine learning (ML) systems must be adapted incrementally to the new data, which is done with online learning or continuous ML technologies. While deep learning technologies have shown exceptional performance on predefined datasets, they have not been widely applied to online, streaming, and continuous learning. In this retrospective of our tutorial, titled Opportunities and Challenges of Online Deep Learning and held at ECML PKDD 2023, we provide a brief overview of the opportunities, but also the potential pitfalls, of applying neural networks in online learning environments using the frameworks River and Deep-River.
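
To make the idea of incremental adaptation concrete, the following is a minimal sketch of a test-then-train (prequential) loop on a synthetic stream whose concept flips halfway through. It deliberately uses only the Python standard library rather than River or Deep-River, so every name in it (`OnlineLogisticRegression`, `drifting_stream`, `prequential_accuracy`, the learning rate, and the window size) is a hypothetical choice made for illustration and not part of either framework or of the tutorial material.

```python
import math
import random


class OnlineLogisticRegression:
    """Illustrative model: logistic regression updated one sample at a time."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba_one(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def predict_one(self, x):
        return 1 if self.predict_proba_one(x) >= 0.5 else 0

    def learn_one(self, x, y):
        # Single SGD step on the log-loss for one (x, y) pair.
        error = self.predict_proba_one(x) - y
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error


def drifting_stream(n_samples, drift_at, seed=0):
    """Yield (x, y) pairs; the labelling rule is inverted at index drift_at."""
    rng = random.Random(seed)
    for t in range(n_samples):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        label = x[0] > 0
        if t >= drift_at:  # abrupt concept drift (assumed scenario)
            label = not label
        yield x, int(label)


def prequential_accuracy(model, stream, window=500):
    """Predict on each sample first, then learn from it; report per-window accuracy."""
    correct_in_window = 0
    accuracies = []
    for t, (x, y) in enumerate(stream, start=1):
        correct_in_window += int(model.predict_one(x) == y)  # test ...
        model.learn_one(x, y)                                # ... then train
        if t % window == 0:
            accuracies.append(correct_in_window / window)
            correct_in_window = 0
    return accuracies


accuracies = prequential_accuracy(
    OnlineLogisticRegression(n_features=2, lr=0.5),
    drifting_stream(n_samples=4000, drift_at=2000),
)
print(accuracies)
```

Under these assumptions, the per-window accuracy should dip in the window that immediately follows the drift and then recover as the model keeps updating. A model trained once on the first half of the stream and then frozen would not recover in this way.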

Paper Structure (11 sections, 4 figures)

This paper contains 11 sections and 4 figures.

Figures (4)

  • Figure 1: Structure of the interaction between data stream and prediction model (adapted from montiel2019).
  • Figure 2: Anomaly scores and decision boundaries of the autoencoder and Half-Space Trees anomaly detectors for a stream of credit card transactions (dalpozzoloLearnedLessonsCredit2014b). Anomalies are shown in red.
  • Figure 3: Results of prequential evaluations run with different MLP architectures on the first 10,000 samples of the Insects abrupt classification dataset (souzaChallengesBenchmarkingStream2020d).
  • Figure 4: Screen capture of the live visualization demo comparing ExtremelyFastDecisionTreeClassifier and HoeffdingTreeClassifier on the Accuracy and Cohen's kappa metrics.