HIVE-COTE 2.0: a new meta ensemble for time series classification

Matthew Middlehurst, James Large, Michael Flynn, Jason Lines, Aaron Bostrom, Anthony Bagnall

TL;DR

This work proposes comprehensive changes to the HIVE-COTE algorithm which significantly improve its accuracy and usability, and introduces two novel classifiers, the Temporal Dictionary Ensemble and Diverse Representation Canonical Interval Forest, which replace existing ensemble members.

Abstract

The Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) is a heterogeneous meta ensemble for time series classification. HIVE-COTE forms its ensemble from classifiers of multiple domains, including phase-independent shapelets, bag-of-words based dictionaries and phase-dependent intervals. Since it was first proposed in 2016, the algorithm has remained state of the art for accuracy on the UCR time series classification archive. Over time it has been incrementally updated, culminating in its current state, HIVE-COTE 1.0. During this time a number of algorithms have been proposed which match the accuracy of HIVE-COTE. We propose comprehensive changes to the HIVE-COTE algorithm which significantly improve its accuracy and usability, presenting this upgrade as HIVE-COTE 2.0. We introduce two novel classifiers, the Temporal Dictionary Ensemble (TDE) and Diverse Representation Canonical Interval Forest (DrCIF), which replace existing ensemble members. Additionally, we introduce the Arsenal, an ensemble of ROCKET classifiers, as a new HIVE-COTE 2.0 constituent. We demonstrate that HIVE-COTE 2.0 is significantly more accurate than the current state of the art on 112 univariate UCR archive datasets and 26 multivariate UEA archive datasets.

Paper Structure

This paper contains 15 sections, 21 figures, 11 tables, 4 algorithms.

Figures (21)

  • Figure 1: Critical difference diagram for HC2 against the current state of the art on 112 UCR TSC problems. The average rank for each classifier is shown, and solid lines group classifiers between which there is no significant difference. It demonstrates that there is no significant difference between HC1 [bagnall20hivecote1], InceptionTime [fawaz20inception], ROCKET [dempster20rocket] and TS-CHIEF [shifaz20ts-chief], but HC2 is significantly higher ranked than all of them. More details are given in Section \ref{sec:results}.
  • Figure 2: An overview of the ensemble structure of HIVE-COTE 2.0 for a three class problem. Each module is trained independently and produces an estimate of the probability of membership of each class for unseen data. The control unit (CAWPE) combines these probabilities, weighted by an estimate of the quality of the module found on the train data.
  • Figure 3: Results of five dictionary based classifiers on 106 of the UCR datasets. The missing datasets are: ElectricDevices; FordA; FordB; HandOutlines; NonInvasiveFetalECGThorax1; and NonInvasiveFetalECGThorax2. These are missing due to the long run time of S-BOSS and WEASEL. cBOSS samples 250 parameter sets and has an ensemble size of 50. WEASEL $\chi$ is set to 2.
  • Figure 4: Critical difference diagram for five interval based classifiers on 112 UCR datasets. Each classifier builds 500 trees. TSF and CIF extract $\sqrt{m}$ intervals per tree. CIF subsamples 8 attributes per tree.
  • Figure 5: Critical difference diagram for both versions of ROCKET and versions of HIVE-COTE using them on 112 UCR datasets. HC2-Ar1H represents HIVE-COTE using the Arsenal classifier with probabilities generated in the same way as ROCKET.
  • ...and 16 more figures
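
The combination step described in the Figure 2 caption, where each module's class probabilities are weighted by an estimate of that module's quality on the train data, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `weighted_probability_ensemble`, the use of train accuracy as the quality estimate, and the exponent `alpha = 4` (the value we understand CAWPE to use by default) are assumptions made for illustration only.

```python
import numpy as np


def weighted_probability_ensemble(module_probas, train_accuracies, alpha=4.0):
    """Combine per-module class probabilities with accuracy-based weights.

    module_probas: array-like of shape (n_modules, n_cases, n_classes).
    train_accuracies: array-like of shape (n_modules,), an estimate of each
        module's accuracy found on the train data (assumed quality measure).
    alpha: exponent applied to the accuracies (assumed value, see lead-in).
    """
    probas = np.asarray(module_probas, dtype=float)
    weights = np.asarray(train_accuracies, dtype=float) ** alpha

    # Weighted sum over the module axis gives shape (n_cases, n_classes).
    combined = np.tensordot(weights, probas, axes=1)

    # Renormalise so each case's class probabilities sum to one.
    combined /= combined.sum(axis=1, keepdims=True)
    return combined


# Hypothetical three class problem with two modules and one unseen case.
example_probas = [
    [[0.6, 0.3, 0.1]],  # module A
    [[0.2, 0.7, 0.1]],  # module B
]
example_accuracies = [0.9, 0.5]
example_output = weighted_probability_ensemble(example_probas, example_accuracies)
predicted_class = int(np.argmax(example_output, axis=1)[0])
```

Raising the accuracies to a power larger than one amplifies the gap between modules, so a module that looked clearly stronger on the train data dominates the vote; with `alpha = 0` the scheme reduces to a plain average of the module probabilities.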