Some variation of COBRA in sequential learning setup

Aryan Bhambu, Arabin Kumar Dey

TL;DR

This work extends the COBRA (Combined Regression Strategy) ensemble to a sequential, multivariate time-series setting by introducing the Dynamic Proximity Ensemble (DPE) and the Partition-Dynamic Proximity Ensemble (PaDPE). Both approaches build a frame-based representation of the time-series data, train multiple machines, and combine their predictions using consensus-based weights governed by a proximity threshold $\epsilon$ and a consensus parameter $\alpha$; PaDPE additionally partitions the training data to enhance robustness. A Bayesian optimisation algorithm (BOA) based on Tree-structured Parzen Estimators tunes the hyperparameters automatically and outperforms grid search across eight datasets spanning cryptocurrency, stock indices, and short-term load forecasting. Empirical results show that DPE often achieves the best RMSE/MAPE and that BOA-driven configurations (especially in PaDPE) yield superior performance; Wilcoxon tests confirm statistically significant differences among models. The study demonstrates strong potential for COBRA-based ensembles in dynamic, high-dimensional forecasting tasks and outlines directions for improved dynamic prediction and interval estimation.
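The consensus idea summarised above can be illustrated with a minimal sketch. It assumes the classical COBRA aggregation rule (average the held-out targets whose machine predictions lie within $\epsilon$ of the query's predictions for at least a fraction $\alpha$ of the machines) and a simple sliding-window framing; it is not the authors' DPE or PaDPE implementation. The function names `make_frames` and `cobra_predict`, the next-step target on the first column, and the fallback used when no held-out point reaches consensus are all illustrative assumptions.

```python
import numpy as np


def make_frames(series, window):
    """Slice a (T, d) multivariate series into overlapping frames.

    Returns X with shape (T - window, window * d) and y with shape
    (T - window,), where y is the next-step value of the first column.
    The choice of target column is an assumption for illustration.
    """
    series = np.asarray(series, dtype=float)
    n_steps = series.shape[0]
    X = np.stack([series[t:t + window].ravel() for t in range(n_steps - window)])
    y = series[window:, 0]
    return X, y


def cobra_predict(ref_preds, y_ref, query_preds, eps, alpha):
    """COBRA-style consensus aggregation for a single query point.

    ref_preds   : (n_ref, M) predictions of M machines on held-out points
    y_ref       : (n_ref,) true targets of those held-out points
    query_preds : (M,) predictions of the same machines at the query
    eps         : proximity threshold
    alpha       : fraction of machines that must agree, in (0, 1]
    """
    ref_preds = np.asarray(ref_preds, dtype=float)
    y_ref = np.asarray(y_ref, dtype=float)
    query_preds = np.asarray(query_preds, dtype=float)

    # A machine "votes" for a held-out point when its prediction there
    # is within eps of its prediction at the query.
    close = np.abs(ref_preds - query_preds) <= eps
    votes = close.mean(axis=1)
    selected = votes >= alpha

    if not selected.any():
        # Assumption: fall back to the plain average of the machines.
        return float(query_preds.mean())
    return float(y_ref[selected].mean())
```

In this sketch a smaller `eps` or a larger `alpha` makes the consensus stricter, so fewer held-out points contribute to the average; these are the two quantities a tuner such as grid search or Bayesian optimisation would search over.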

Abstract

This research paper introduces innovative approaches for multivariate time series forecasting based on different variations of the combined regression strategy. We use specific data preprocessing techniques that radically change the behaviour of the predictions. We compare model performance under two types of hyper-parameter tuning: Bayesian optimisation (BO) and the usual grid search. Our proposed methodologies outperform all the state-of-the-art models used for comparison. We illustrate the methodologies through eight time series datasets from three categories: cryptocurrency, stock index, and short-term load forecasting.

Paper Structure

This paper contains 28 sections, 14 equations, 14 figures, 6 tables, 3 algorithms.

Figures (14)

  • Figure 1: The schematic representation of the Dynamic Proximity Ensemble (DPE) approach. It condenses the model into four key phases: fine-tuning the model candidates' hyper-parameters, configuring combinations via the Bayesian optimisation algorithm (BOA), retraining the model candidates, and generating the ensemble output with the optimal combination configuration.
  • Figure 2: The schematic representation of the Partition-Dynamic Proximity Ensemble (PaDPE) approach. It condenses the model into four key phases: fine-tuning the model candidates' hyper-parameters, configuring combinations via the Bayesian optimisation algorithm (BOA), retraining the model candidates, and generating the ensemble output with the optimal combination configuration.
  • Figure 3: The schematic flowchart of the proposed dynamic proximity ensemble forecasting framework.
  • Figure 4: Short-term load forecast time series data for New South Wales (NSW), showing the train, validation, and test partitions of the raw data for total demand and RRP.
  • Figure 5: Time series data for the Bitcoin cryptocurrency, illustrating the train, validation, and test partitions of the raw data for closing price and volume.
  • ...and 9 more figures