StreamEnsemble: Predictive Queries over Spatiotemporal Streaming Data

Anderson Chaves, Eduardo Ogasawara, Patrick Valduriez, Fabio Porto

TL;DR

This work proposes StreamEnsemble, a novel approach to predictive queries over spatiotemporal stream data that dynamically selects and allocates machine learning models according to the underlying time series distributions and model characteristics.

Abstract

Predictive queries over spatiotemporal (ST) stream data pose significant data processing and analysis challenges. An ST data stream comprises a set of time series whose data distributions may vary in space and time, exhibiting multiple distinct patterns. In this context, a single machine learning model is unlikely to handle such variations adequately. To address this challenge, we propose StreamEnsemble, a novel approach to predictive queries over ST data that dynamically selects and allocates machine learning models according to the underlying time series distributions and model characteristics. Our experimental evaluation shows that this method outperforms traditional ensemble methods and single-model approaches in both accuracy and time, reducing prediction error by more than a factor of 10 compared to traditional approaches.
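The idea of allocating a different model to each part of the stream can be illustrated with a small sketch. It is not the StreamEnsemble algorithm: this excerpt does not describe the paper's selection criterion, so the sketch uses a simple stand-in and gives each region the candidate with the lowest one-step-ahead error on recent data. The three toy predictors, the region names, and the helpers `recent_error` and `allocate_models` are all hypothetical.

```python
from statistics import mean

# Toy candidate models (hypothetical): each maps a window of recent
# values to a one-step-ahead prediction.
def naive_last(window):
    """Predict that the next value equals the last observed value."""
    return window[-1]

def moving_average(window):
    """Predict the mean of the window."""
    return mean(window)

def linear_trend(window):
    """Extrapolate the difference between the last two observations."""
    return window[-1] + (window[-1] - window[-2])

CANDIDATES = {
    "naive_last": naive_last,
    "moving_average": moving_average,
    "linear_trend": linear_trend,
}

def recent_error(model, series, history=3):
    """Mean absolute one-step-ahead error of `model` over `series`,
    using a sliding window of `history` past values."""
    errors = [
        abs(model(series[i - history:i]) - series[i])
        for i in range(history, len(series))
    ]
    return mean(errors)

def allocate_models(regions):
    """Assign to each region the candidate with the lowest recent error.

    Stand-in criterion (an assumption): the paper selects models from
    time series distributions and model characteristics instead.
    """
    return {
        region_id: min(
            CANDIDATES,
            key=lambda name: recent_error(CANDIDATES[name], series),
        )
        for region_id, series in regions.items()
    }

# One representative series per region (made-up values).
regions = {
    "ramp": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "oscillating": [4.0, 6.0, 4.0, 6.0, 4.0, 6.0, 4.0, 6.0],
    "step": [1.0, 1.0, 1.0, 1.0, 9.0, 9.0, 9.0, 9.0],
}

allocation = allocate_models(regions)
print(allocation)
```

Each of the three regions ends up with a different model, which is the behavior a single global model cannot reproduce. In a streaming setting, the allocation would be recomputed as new windows arrive.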

Paper Structure

This paper contains 29 sections, 1 equation, 9 figures, 2 tables, 1 algorithm.

Figures (9)

  • Figure 1: Two 20x20 time series grids composed of 16 regions. All time series within a region share a common pattern.
  • Figure 2: Impact of different model allocation strategies on prediction accuracy.
  • Figure 3: ST Model Ensemble defined over query region
  • Figure 4: StreamEnsemble
  • Figure 5: Clusters identified by GLD and Parcorr with different numbers of basis vectors
  • ...and 4 more figures