Some variation of COBRA in sequential learning setup

Aryan Bhambu; Arabin Kumar Dey

TL;DR

This work extends the COBRA (Combined Regression Strategy) ensemble to a sequential, multivariate time-series setting by introducing the Dynamic Proximity Ensemble (DPE) and the Partition-Dynamic Proximity Ensemble (PaDPE). Both approaches build a frame-based representation of the time-series data, train multiple machines, and combine their predictions using consensus-based weights governed by a proximity threshold $\epsilon$ and a consensus parameter $\alpha$; PaDPE additionally partitions the training data to enhance robustness. A Bayesian optimisation algorithm (BOA) based on the Tree-structured Parzen Estimator tunes the hyperparameters automatically and outperforms grid search across eight datasets spanning cryptocurrency, stock indices, and short-term load forecasting. Empirical results show that DPE often achieves the best RMSE/MAPE and that BOA-driven configurations (especially in PaDPE) yield superior performance; Wilcoxon tests confirm statistically significant differences among the models. The study demonstrates strong potential for COBRA-based ensembles in dynamic, high-dimensional forecasting tasks and outlines directions for improved dynamic prediction and interval estimation.

Abstract

This paper introduces new approaches for multivariate time series forecasting based on variations of the combined regression strategy (COBRA). We apply specific data preprocessing techniques that radically change the prediction behaviour. We compare model performance under two types of hyper-parameter tuning: Bayesian optimisation (BO) and the usual grid search. Our proposed methodologies outperform all the state-of-the-art models used for comparison. We illustrate the methodologies on eight time series datasets from three categories: cryptocurrency, stock index, and short-term load forecasting.
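
To make the consensus rule from the TL;DR concrete, the following is a minimal sketch of a COBRA-style aggregation step in Python. It is not the authors' DPE or PaDPE implementation: the function name `cobra_predict`, the array layout, and the fallback to the mean machine prediction when no point reaches consensus are all assumptions made for illustration. A held-out point is retained when at least a fraction $\alpha$ of the machines predict within $\epsilon$ of their prediction for the query, and the forecast is the average response of the retained points.

```python
import numpy as np


def cobra_predict(preds_holdout, y_holdout, preds_query, eps, alpha):
    """COBRA-style consensus aggregation (illustrative sketch, hypothetical names).

    preds_holdout : (n, M) predictions of M machines on n held-out points
    y_holdout     : (n,)   observed responses for those points
    preds_query   : (M,)   predictions of the same machines at the query point
    eps           : proximity threshold
    alpha         : fraction of machines that must agree, in (0, 1]
    """
    preds_holdout = np.asarray(preds_holdout, dtype=float)
    y_holdout = np.asarray(y_holdout, dtype=float)
    preds_query = np.asarray(preds_query, dtype=float)
    n_machines = preds_holdout.shape[1]

    # For each held-out point, count machines whose prediction lies within eps
    # of that machine's prediction at the query point.
    close = np.abs(preds_holdout - preds_query) <= eps
    votes = close.sum(axis=1)

    # Keep the points on which at least alpha * M machines agree.
    selected = votes >= alpha * n_machines

    if not selected.any():
        # Assumption: with no consensus, fall back to the mean machine prediction.
        return float(preds_query.mean())

    # Uniform weights over the retained points.
    return float(y_holdout[selected].mean())
```

With `alpha = 1.0` every machine must agree before a held-out point contributes; lowering `alpha` relaxes the consensus and typically retains more points, which is the trade-off that the hyperparameter search over $\epsilon$ and $\alpha$ has to balance.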

Paper Structure (28 sections, 14 equations, 14 figures, 6 tables, 3 algorithms)

Introduction
Contributions of our proposed model
Proposed Methodologies
Dynamic Proximity based Ensemble (DPE)
Partition-Dynamic Proximity based Ensemble (PaDPE)
Training
Testing
DPE Methodology on Test dataset
PaDPE Methodology
Bayesian Optimisation for Hyperparameter Tuning
Empirical Study
Data and its nature
Assessment Metrics
Root Mean Square Error (RMSE)
Mean Absolute Percentage Error (MAPE)
...and 13 more sections

Figures (14)

Figure 1: The schematic representation of the Dynamic Proximity Ensemble (DPE) approach. It condenses the model into four key phases: fine-tuning of model candidates' hyper-parameters, configuring combinations via the Bayesian optimisation algorithm (BOA), retraining the model candidates, and generating the ensemble output with the optimal combination configuration.
Figure 2: The schematic representation of the Partition-Dynamic Proximity Ensemble (PaDPE) approach. It condenses the model into four key phases: fine-tuning of model candidates' hyper-parameters, configuring combinations via the Bayesian optimisation algorithm (BOA), retraining the model candidates, and generating the ensemble output with the optimal combination configuration.
Figure 3: The schematic flowchart of the proposed dynamic proximity ensemble forecasting framework
Figure 4: Short-term load forecast time series data for New South Wales (NSW) showing the train, validation, and test partitions of the raw data for total demand and RRP respectively.
Figure 5: Time series data for the Bitcoin cryptocurrency, illustrating the train, validation, and test partitions of the raw data for closing price and volume.
...and 9 more figures
